NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenosine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, a compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which they have a
signature region (as set forth in Table 3); or for which they have homology to a gene family (as
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection.


4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single-stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
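
The base-pairing rule in this definition can be illustrated with a short Python sketch. It is offered only as an illustration and is not part of the disclosed methods: the function names are hypothetical, and treating N as pairing with N is an assumption made for convenience.

```python
# Watson-Crick pairing for DNA bases. Treating N as pairing with N is an
# illustrative assumption, not something stated in the text.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


def paired_fraction(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that base-pair.

    1.0 corresponds to "complete" complementarity; anything lower is "partial".
    """
    if not strand_5to3 or len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    pairs = sum(
        PAIRS[a] == b for a, b in zip(strand_5to3.upper(), other_3to5.upper())
    )
    return pairs / len(strand_5to3)


if __name__ == "__main__":
    # 5'-AGT-3' pairs with 3'-TCA-5', the example given in the definition.
    print(complement("AGT"))
    print(paired_fraction("AGT", "TCA"))
```

Here `complement("AGT")` yields the 3'-to-5' partner strand from the example, while `reverse_complement` gives the same strand written in the conventional 5'-to-3' direction.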

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

WO 01/53312 PCT/US00/34263

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl. 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York, NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
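
The arithmetic behind these estimates can be reproduced with a minimal Python sketch. The genome size (three billion base pairs) and the 5% expressed fraction are taken from the paragraph above; the function name and the choice to use the 5% upper bound as an exact value are assumptions made for illustration.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # text says "less than approximately 5%"; the bound is used here


def kmers_per_base(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to the number of base pairs searched.

    A ratio of R corresponds to the "1 in R" chance of a full match quoted in the text.
    """
    return 4 ** k / target_bp


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print(f"20-mer vs genome:    1 in {kmers_per_base(20, GENOME_BP):.0f}")
    print(f"17-mer vs genome:    1 in {kmers_per_base(17, GENOME_BP):.1f}")
    print(f"15-mer vs expressed: 1 in {kmers_per_base(15, expressed_bp):.1f}")
```

Under these assumptions the ratios come out near 367, 5.7 and 7.2 respectively, which is consistent with the rounded figures of 1 in 300 and approximately 1 in 5 given in the text.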

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
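
The single-mismatch calculation can be sketched in the same way. This is a hedged illustration, not the applicants' own computation: the function name is hypothetical, and scaling the per-position probability by the number of base pairs searched is an assumption carried over from the full-match analysis in the preceding paragraph.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # assumed upper bound on the expressed fraction (from the text)


def single_mismatch_expectation(k: int, target_bp: float) -> float:
    """Expected chance occurrences of a k-mer carrying exactly one mismatch.

    The full-match probability at one position is 1 / 4**k. Each of the k
    positions can hold any of 3 other bases, giving 3 * k single-mismatch
    variants. Multiplying by target_bp (an assumption, see lead-in) scales
    the result to the amount of sequence searched.
    """
    return target_bp * (3 * k) / 4 ** k


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    for label, k, bp in [
        ("25-mer vs genome", 25, GENOME_BP),
        ("20-mer vs genome", 20, GENOME_BP),
        ("18-mer vs expressed", 18, expressed_bp),
    ]:
        value = single_mismatch_expectation(k, bp)
        print(f"{label}: {value:.2e} (about 1 in {1 / value:.1f})")
```

With these inputs the twenty-mer and eighteen-mer cases come out at roughly 1 in 6 and 1 in 8.5, the same order as the "approximately one in five" stated in the text, while the twenty-five mer is far less likely to occur by chance.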

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
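
As an illustration of this definition, the following Python sketch splits a reading frame into runs of triplets that contain no termination codon. The stop codons TAA, TAG and TGA are those of the standard genetic code and are not listed in the text; the function names are hypothetical, and only the three forward frames are scanned to keep the example short.

```python
# Termination codons of the standard genetic code (an assumption; the text
# does not enumerate them).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq: str, frame: int = 0) -> list[str]:
    """Split one reading frame into maximal runs of codons with no stop codon."""
    seq = seq.upper()
    runs: list[str] = []
    current: list[str] = []
    # Step through complete triplets only; trailing 1-2 bases are ignored.
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            if current:
                runs.append("".join(current))
            current = []
        else:
            current.append(codon)
    if current:
        runs.append("".join(current))
    return runs


def longest_orf(seq: str) -> str:
    """Longest stop-free run across the three forward reading frames."""
    candidates = [run for frame in range(3) for run in stop_free_runs(seq, frame)]
    return max(candidates, key=len, default="")


if __name__ == "__main__":
    print(stop_free_runs("ATGGCCTAAGGG", frame=0))
    print(longest_orf("ATGGCCTAAGGG"))
```

A fuller treatment would also scan the reverse complement and might require a start codon; both are omitted here because the definition above does not mention them.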

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein, which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
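
The idea of a silent codon change that creates a restriction site can be illustrated with a small Python sketch. The codon table is the standard genetic code; the example sequences and the use of the BamHI recognition site GGATCC are hypothetical choices made for illustration and do not come from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
# "*" marks a stop codon, matching the convention used in the Sequence Listing.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_change(original: str, variant: str) -> bool:
    """True if the variant differs in DNA sequence but encodes the same amino acids."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


if __name__ == "__main__":
    # Hypothetical example: TCA (Ser) -> TCC (Ser) creates GGATCC, the BamHI
    # site, while the encoded Gly-Ser dipeptide is unchanged.
    original, variant = "GGATCA", "GGATCC"
    print(translate(original), translate(variant))
    print(is_silent_change(original, variant))
```

The same check can be used to screen candidate substitutions before introducing them: any variant for which `is_silent_change` is false alters the encoded polypeptide.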

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
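
The four residue classes listed above can be captured in a short Python sketch. The grouping follows the paragraph exactly; the one-letter codes, the class labels and the function name are illustrative assumptions, and a same-class test is only a rough proxy for whether a substitution preserves activity.

```python
# Residue classes as listed in the text, written with one-letter amino acid codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

# Reverse lookup: residue -> class name.
CLASS_OF = {aa: name for name, members in CLASSES.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same class from the text."""
    return CLASS_OF[original.upper()] == CLASS_OF[replacement.upper()]


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both nonpolar
    print(is_conservative("K", "E"))  # basic vs acidic
```

Whether a given conservative substitution is actually tolerated would still be determined experimentally, as the paragraph above describes.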
Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C); and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
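
The oligonucleotide wash temperatures listed above can be held in a small lookup table. The following is a minimal sketch only: the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and the helper covers only the four oligonucleotide lengths named in the text, without interpolating to other lengths.

```python
# Wash temperatures (deg C) in 6X SSC/0.05% sodium pyrophosphate, keyed by
# oligonucleotide length in bases, as listed in the paragraph above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the listed wash temperature for an oligo of the given length.

    Hypothetical helper: only the four listed lengths are supported, and no
    temperature is guessed for any other length.
    """
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no listed wash temperature for {oligo_length}-base oligonucleotides"
        ) from None
```

A length outside the table raises `ValueError` rather than returning an estimate, since the text gives no rule for intermediate lengths.
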

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
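
The arithmetic behind the "no more than about 35%" criterion can be sketched as follows. This is an illustration of the ratio stated in the definition above (changes divided by the length of the substantially equivalent sequence), not an implementation of the Jotun Hein method; the function names and the assumption that the number of changes has already been counted from an alignment are hypothetical.

```python
def fraction_varied(num_changes: int, sequence_length: int) -> float:
    """Changes (substitutions + additions + deletions) per residue.

    sequence_length is the total number of residues in the substantially
    equivalent (subject) sequence, as in the definition above.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if num_changes < 0:
        raise ValueError("num_changes must be non-negative")
    return num_changes / sequence_length


def percent_identity(num_changes: int, sequence_length: int) -> float:
    """Percent identity implied by the definition: 100 * (1 - changes/length)."""
    fraction_varied(num_changes, sequence_length)  # reuse the input validation
    return 100.0 * (sequence_length - num_changes) / sequence_length


def is_substantially_equivalent(
    num_changes: int, sequence_length: int, max_fraction: float = 0.35
) -> bool:
    """True when the variation does not exceed the chosen threshold.

    The default of 0.35 mirrors the broadest embodiment; the narrower
    embodiments correspond to 0.30, 0.25, 0.20, 0.10 and 0.05.
    """
    return fraction_varied(num_changes, sequence_length) <= max_fraction
```

For example, 35 changes in a 100-residue sequence give a ratio of 0.35 and 65% identity, which sits exactly at the broadest threshold; 36 changes would fall outside it. The sketch treats "about 0.35" as an exact cutoff, which is a simplification.
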

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The invention also provides polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
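
The point about codon variability can be illustrated with a short sketch, assuming the standard genetic code: two ORFs that differ only by synonymous codons translate to the same amino acid sequence. The names `CODON_TABLE`, `translate` and `synonymous_codons` are hypothetical helpers, not part of any disclosed method.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF into one-letter amino acid codes."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def synonymous_codons(codon: str) -> list[str]:
    """All codons (including the input) that encode the same amino acid."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
```

For instance, `translate("ATGGCTTGTTAA")` and `translate("ATGGCCTGCTAA")` both give `"MAC*"`, because GCT/GCC both encode alanine and TGT/TGC both encode cysteine.
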

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, may be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
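
The idea of an oligonucleotide that carries the changed codon together with unchanged flanking nucleotides on both sides can be sketched as below. This is a simplified illustration under stated assumptions, not a validated primer-design procedure: the function name `design_mutagenic_oligo` is hypothetical, the default flank of 12 nucleotides is an arbitrary choice rather than a value from the text, and the oligo is written in the sense-strand orientation of the template.

```python
def design_mutagenic_oligo(
    template: str, codon_index: int, new_codon: str, flank: int = 12
) -> str:
    """Return an oligo carrying new_codon flanked by unchanged template bases.

    template    -- coding-strand DNA whose reading frame starts at position 0
    codon_index -- 0-based index of the codon to change
    new_codon   -- replacement codon (three DNA bases)
    flank       -- unchanged bases kept on each side (assumed value, so that
                   the oligo can pair with the template around the mismatch)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three DNA bases")
    start = 3 * codon_index
    end = start + 3
    if codon_index < 0 or start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence on both sides of the codon")
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

As a usage example, replacing a cysteine codon (TGT) with a serine codon (TCT) at codon index 4 of `"ATGAAACCCGGGTGTGGGCCCAAATTT"` yields `"ATGAAACCCGGGTCTGGGCCCAAATTT"`, which differs from the template at a single base.
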

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells.
Also included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of
suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), a-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for



WO 01/53312 PCT/US00/34263
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of a mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
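
Designing an antisense oligonucleotide by Watson and Crick pairing amounts to taking the reverse complement of the chosen stretch of the coding strand. The sketch below illustrates this for a window around the translation start site; it is a minimal example only. The function names, the default window (5 nucleotides upstream of the start codon, 20 nucleotides long) and the DNA-only alphabet are assumptions, and Hoogsteen pairing and modified nucleotides are not modeled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of a DNA sequence, written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(
    coding_strand: str, start_codon_index: int, upstream: int = 5, length: int = 20
) -> str:
    """Antisense oligo complementary to a window around the start codon.

    coding_strand     -- sense (coding) strand DNA sequence
    start_codon_index -- 0-based position of the A of the ATG start codon
    upstream          -- bases of 5' untranslated sequence to include (assumed)
    length            -- total oligo length (assumed; the text lists about 5 to 50)
    """
    begin = start_codon_index - upstream
    end = begin + length
    if begin < 0 or end > len(coding_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(coding_strand[begin:end])
```

With the illustrative sequence `"GGGCCCAAATTTATGGCTAGCTAGCTAGG"` (start codon at index 12), `antisense_oligo(seq, 12, upstream=5, length=15)` targets the window `"AATTTATGGCTAGCT"` and returns `"AGCTAGCCATAAATT"`.
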

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6133-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-11.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
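
As a rough illustration of the amino acid identity thresholds recited above, the following
Python sketch computes percent identity over two pre-aligned sequences and compares it with a
chosen cutoff. The helper names and example sequences are hypothetical, and the denominator
convention used here (aligned columns excluding gap-to-gap columns) is only one of several in
common use; alignment programs may report identity differently.

```python
# Illustrative sketch only: helper names and sequences are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # hypothetical reference fragment
    variant = "MKTAYLAKQR"     # hypothetical variant with one substitution
    print(percent_identity(reference, variant))               # 90.0
    print(meets_identity_threshold(reference, variant, 85))   # True
    print(meets_identity_threshold(reference, variant, 95))   # False
```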

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
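
The definition of a degenerate variant can be made concrete with a short Python sketch: two
ORFs that differ in nucleotide sequence but translate to the same polypeptide under the standard
genetic code. The function names and the example ORFs are hypothetical and are not drawn from
the sequence listing.

```python
# Illustrative sketch only: function names and example ORFs are assumptions.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by pairing codons (TCAG order) with one-letter residues.
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate a DNA ORF (5'->3', in frame) until the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence yet encode an identical polypeptide."""
    return orf_a != orf_b and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    orf = "ATGGCTAGC"       # hypothetical ORF encoding Met-Ala-Ser
    variant = "ATGGCCTCT"   # different codons for Ala and Ser
    print(translate(orf), translate(variant))    # MAS MAS
    print(is_degenerate_variant(orf, variant))   # True
```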

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins, may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
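
The sequence-generation step of the alanine-scanning method described above can be sketched in
Python as follows. The sketch only enumerates the alanine-containing variants; testing each
variant for biological activity is an experimental step that is not modeled. The function name,
the `window` parameter, and the example peptide are hypothetical.

```python
# Illustrative sketch only: names and the example peptide are assumptions.
def alanine_scan(protein: str, window: int = 1) -> dict[int, str]:
    """Map each 1-based start position to the variant with `window` residues set to Ala.

    window=1 substitutes single residues; window>1 substitutes strings of residues.
    Positions whose residues are already all alanine are skipped.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    variants = {}
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # substitution would leave the sequence unchanged
        variants[i + 1] = protein[:i] + "A" * window + protein[i + window:]
    return variants


if __name__ == "__main__":
    for position, variant in alanine_scan("MKAC").items():
        print(position, variant)
    # 1 AKAC
    # 2 MAAC
    # 4 MKAA
```

Each variant would then be expressed and assayed, and a loss of activity at a given position
would suggest that the substituted residue contributes to function.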

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP~ HPLC) 
steps employing hydrophobic RP-HPLC media, e.g. , silica gel having pendant methyl or other 

30 aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, SK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
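
As an illustration only of what "the largest match between the sequences tested" means in
practice, the following minimal sketch performs a simple global alignment in the style of
Needleman and Wunsch and reports percent identity over the aligned length. It is not the GAP or
BLAST implementation; the function names, the scoring values (match, mismatch, gap) and the
example sequences are assumptions chosen for clarity.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative scoring values)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    ga, gb = global_align(a, b)
    if not ga:
        return 0.0
    matches = sum(1 for x, y in zip(ga, gb) if x == y and x != "-")
    return 100.0 * matches / len(ga)


if __name__ == "__main__":
    # Hypothetical short sequences, not sequences of the invention
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Published programs differ in their scoring matrices, gap penalties and the denominator used for
percent identity, so values from this sketch are not expected to match their output exactly.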

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically for both the treatment of
proliferative and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
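
The requirement that the two coding sequences be joined "in-frame" can be checked
computationally before any construct is made. The sketch below is a hypothetical helper, not a
method described in this specification: it joins an upstream and a downstream coding sequence,
drops a terminal stop codon from the upstream partner, and refuses any piece whose length is not
a multiple of three. The function names and the toy sequences are assumptions; the codon table is
the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(upstream, downstream, linker=""):
    """Join coding sequences so the downstream partner keeps its reading frame."""
    parts = {"upstream": upstream.upper(),
             "linker": linker.upper(),
             "downstream": downstream.upper()}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")
    up = parts["upstream"]
    # Remove a terminal stop codon so translation reads through into the partner
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    return up + parts["linker"] + parts["downstream"]


if __name__ == "__main__":
    # Toy sequences for illustration only
    fusion = fuse_in_frame("ATGGCCTAA", "ATGAAATGA")
    print(fusion, translate(fusion))
```

A real construct would also need to account for restriction sites, linker design and codon usage,
none of which this sketch attempts.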

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
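
By way of illustration, an antisense molecule directed to a region of a nucleic acid is the
reverse complement of that region. The following minimal sketch derives such a sequence for a
chosen window of a target; the function names, the window parameters and the toy target sequence
are assumptions for illustration and do not represent a sequence or design procedure of the
invention.

```python
# Translation table mapping each DNA base to its Watson-Crick complement
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(target, start, length):
    """Antisense sequence (written 5' to 3') for target[start:start+length]."""
    if start < 0 or length <= 0 or start + length > len(target):
        raise ValueError("window falls outside the target sequence")
    return reverse_complement(target[start:start + length])


if __name__ == "__main__":
    # Toy target sequence for illustration only
    print(antisense_oligo("ATGGCCAAATGA", 0, 6))
```

Practical antisense design also weighs target accessibility, melting temperature and off-target
matches, which are outside the scope of this sketch.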

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.



4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes for the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto, 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.


4.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52:921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17:4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40:1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.



4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine or rhodamine, or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 282:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
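
The comparison of two otherwise identical cell populations described above can be sketched in code. The following Python example is a minimal illustration only. It assumes that each candidate ligand yields one numeric readout per cell population (for example, a reporter signal); the function name, the ligand names, the readout values and the 2-fold cutoff are all hypothetical and are not taken from this specification.

```python
def candidate_ligands(receptor_cells, control_cells, min_fold=2.0):
    """Return ligands whose readout in receptor-expressing cells exceeds the
    readout in the otherwise identical control cells by at least min_fold.

    Both arguments map a ligand name to a positive numeric readout.
    min_fold is an arbitrary illustrative cutoff, not a validated value.
    """
    hits = []
    for ligand, signal in receptor_cells.items():
        baseline = control_cells.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        if signal / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical readouts (arbitrary units) for three placeholder ligands.
with_receptor = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.2}
without_receptor = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 4.0}
print(candidate_ligands(with_receptor, without_receptor))
```

A ligand that produces a similar response in both populations (such as the placeholder ligand_C) is not flagged, since its effect does not depend on expression of the receptor.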

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, or other autoimmune disease or inflammatory disease, or may be utilized as an
antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention
of premature labor secondary to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).


4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.



4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
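
The principle behind restriction fragment length polymorphism analysis, as mentioned above, can be illustrated with a short Python sketch. This is a simplified model under stated assumptions: the DNA is treated as a linear single-strand string, EcoRI (recognition sequence GAATTC, cutting after the first G) is used purely as a familiar example enzyme, and the two 14-base sequences are toy placeholders, not sequences of the invention. The function name is hypothetical.

```python
def fragment_lengths(seq, site, cut_offset):
    """Return the fragment lengths produced by cutting a linear sequence
    at every occurrence of a recognition site.

    cut_offset is the number of bases from the start of the site to the cut.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC

# Toy alleles (illustrative only): the variant carries a single base
# change inside the recognition site, which abolishes cutting.
reference_allele = "AAAAGAATTCTTTT"
variant_allele = "AAAAGAGTTCTTTT"

print(fragment_lengths(reference_allele, ECORI_SITE, ECORI_CUT_OFFSET))
print(fragment_lengths(variant_allele, ECORI_SITE, ECORI_CUT_OFFSET))
```

In this toy case the reference allele is cut into two fragments while the variant allele remains intact, which is the kind of differential digestion that would be resolved by fragment size in an actual analysis.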

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.



4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
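
The treatment and scoring schedule described above can be laid out programmatically. The Python sketch below is illustrative only and rests on two assumptions: day 0 is taken to be the day of adjuvant injection (with the first administration on that day), and "every other day" is read as a two-day interval. The function names are hypothetical, and no experimental scores are supplied here.

```python
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # days after adjuvant injection


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound is administered, starting at day 0."""
    return list(range(0, last_day + 1, interval))


def mean_score_difference(control_scores, treated_scores, days=SCORING_DAYS):
    """Mean of (control score - treated score) over the scoring days.

    Each argument maps a day number to the overall arthritis score of that
    group on that day. A positive result means lower scores under treatment.
    """
    diffs = [control_scores[d] - treated_scores[d] for d in days]
    return sum(diffs) / len(diffs)


print(treatment_days())
```

Scores recorded for the PBS-only control group and the treated group on each scoring day could then be passed to mean_score_difference to summarize any decrease in the arthritis score.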


4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
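
The weight-based dosing range stated above can be illustrated with a short calculation. The following Python sketch is offered only as an illustration of the arithmetic; the function name, its parameters, and the 70 kg example weight are assumptions and do not appear in the specification. The defaults use the preferred range of about 0.1 µg/kg to 10 mg/kg.

```python
def dose_range_mg(weight_kg, low_ug_per_kg=0.1, high_mg_per_kg=10.0):
    """Return the (low, high) total dose in mg for one administration.

    The defaults correspond to the preferred per-dose range given in the
    text (about 0.1 ug/kg to 10 mg/kg). The function name and signature
    are hypothetical and serve only to illustrate the unit conversion.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low_mg = low_ug_per_kg * weight_kg / 1000.0  # convert ug to mg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Example for a hypothetical 70 kg patient.
low, high = dose_range_mg(70.0)
print(f"preferred range: {low:.3f} mg to {high:.0f} mg per dose")
```

The actual dose within such a range remains a matter for the prescribing physician, as stated above.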

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 

to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.
The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or antithrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or antithrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or antithrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or antithrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding a resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol,
or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
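
The composition of the VPD:5W co-solvent system described above follows from simple dilution arithmetic. The Python sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that the volumes of VPD and of the dextrose solution are additive on mixing, which the specification does not state.

```python
# Stated w/v percentages of the VPD stock (remainder is absolute ethanol).
VPD_W_V_PERCENT = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_W_V_PERCENT = 5.0  # the 5% dextrose in water diluent


def dilute_vpd(vpd_parts=1.0, dextrose_parts=1.0):
    """Return final w/v percentages after mixing VPD with 5% dextrose.

    Assumes additive volumes (an idealization, labeled as an assumption).
    The default 1:1 ratio corresponds to the VPD:5W system in the text.
    """
    if vpd_parts <= 0 or dextrose_parts < 0:
        raise ValueError("vpd_parts must be > 0 and dextrose_parts >= 0")
    total = vpd_parts + dextrose_parts
    final = {name: pct * vpd_parts / total
             for name, pct in VPD_W_V_PERCENT.items()}
    final["dextrose"] = DEXTROSE_W_V_PERCENT * dextrose_parts / total
    return final


for component, pct in dilute_vpd().items():
    print(f"{component}: {pct:g}% w/v")
```

Changing the two ratio arguments shows how the proportions of the co-solvent system may be varied, as contemplated above.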

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanolamine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethyl cellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
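
The weight-percent limits given above for the sequestering agent translate directly into a mass for a formulation of given total weight. The following Python sketch merely illustrates that arithmetic; the function names and the 10 g example are hypothetical and are not taken from the specification.

```python
USEFUL_WT_PERCENT = (0.5, 20.0)     # stated useful range
PREFERRED_WT_PERCENT = (1.0, 10.0)  # stated preferred range


def sequestering_agent_mass(total_formulation_g, wt_percent):
    """Return grams of sequestering agent for a given total formulation weight.

    Rejects weight percentages outside the stated useful range of 0.5-20 wt %.
    """
    lo, hi = USEFUL_WT_PERCENT
    if not lo <= wt_percent <= hi:
        raise ValueError(f"wt_percent must be within {lo}-{hi} wt %")
    return total_formulation_g * wt_percent / 100.0


def in_preferred_range(wt_percent):
    """Return True when wt_percent falls in the preferred 1-10 wt % range."""
    lo, hi = PREFERRED_WT_PERCENT
    return lo <= wt_percent <= hi


# Hypothetical example: 10 g of total formulation at 5 wt % sequestering agent.
print(sequestering_agent_mass(10.0, 5.0), in_preferred_range(5.0))
```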

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 

35 cultures or experimental animals, e.g., for determining the LD 5 o (the dose lethal to 50% of the 
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population) and the ED 50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD 5 o and ED 50 . Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
5 range of dosage for use in human. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the EDso with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
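
The therapeutic index defined above is a simple ratio, and a minimal sketch of the calculation is given here for illustration. The function name and the numeric values are hypothetical and are not taken from this specification; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the ratio.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # 20.0
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose.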

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
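
As a rough illustration of the percentage of time that plasma levels stay above the MEC, the sketch below assumes a deliberately simple model that is not part of this specification: a single dose with instantaneous absorption and first-order elimination, C(t) = C0 * exp(-k * t), ignoring accumulation from earlier doses. The function name, parameters and numbers are hypothetical.

```python
import math


def fraction_above_mec(c0: float, k_el: float, mec: float, interval_h: float) -> float:
    """Fraction of one dosing interval during which C(t) = c0*exp(-k_el*t) >= mec.

    c0         peak plasma concentration right after dosing (assumed model)
    k_el       first-order elimination rate constant, per hour
    mec        minimal effective concentration, same units as c0
    interval_h length of the dosing interval, in hours
    """
    if min(c0, k_el, mec, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c0 <= mec:
        return 0.0
    # Solve c0 * exp(-k_el * t) = mec for t, then cap at the interval length.
    t_above = math.log(c0 / mec) / k_el
    return min(1.0, t_above / interval_h)


# Hypothetical compound: 4 h half-life, peak 8 units, MEC 1 unit, dosed every 24 h.
k = math.log(2) / 4.0
print(round(fraction_above_mec(c0=8.0, k_el=k, mec=1.0, interval_h=24.0), 3))  # 0.5
```

Under these assumptions the level stays above the MEC for three half-lives (12 h), i.e. about 50% of a 24-hour interval, which falls inside the 10-90% range stated above.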

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 μg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
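
By way of illustration only, a hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy scale. The function names, the window length of 7 and the threshold of 0.0 are assumptions made for this sketch, not parameters taken from the specification, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophilic_windows(sequence: str, window: int = 7,
                        threshold: float = 0.0) -> list[int]:
    """Start indices (0-based) of windows whose mean hydropathy is below threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score < threshold]


# Hypothetical peptide: a hydrophobic stretch followed by a charged stretch.
print(hydrophilic_windows("IIIIKKKK", window=4))  # [3, 4]
```

Windows with strongly negative mean values mark candidate hydrophilic, and therefore possibly surface-exposed, regions; the threshold is a free choice.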

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
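
For illustration, the classical linear Scatchard transformation can be sketched as follows. This is a simplified single-site least-squares fit, not the procedure of the cited reference; it assumes the relation Bound/Free = (Bmax - Bound)/Kd, so that a straight-line fit of Bound/Free against Bound has slope -1/Kd and x-intercept Bmax. The function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits Bound/Free = Bmax/Kd - Bound/Kd by ordinary least squares.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope = -1/Kd
    bmax = intercept * kd      # y-intercept = Bmax/Kd
    return kd, bmax


# Synthetic, noise-free data generated from Kd = 2.0 and Bmax = 10.0 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
kd, bmax = scatchard_fit(bound_conc, free_conc)
print(round(kd, 3), round(bmax, 3))
```

A lower Kd corresponds to a higher binding affinity. With real, noisy data a nonlinear fit of the untransformed binding curve is generally more robust than this linearization.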

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-
binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
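
The figure of ten possible molecules follows from simple counting, sketched below for illustration. The labels H1, H2, L1 and L2 are hypothetical names for the two heavy and two light chains; the sketch assumes that each antibody consists of two heavy-chain/light-chain half-molecules and that the order of the two halves does not matter.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each half-molecule pairs one heavy chain with one light chain: 4 possibilities.
half_molecules = [h + l for h, l in product(heavy_chains, light_chains)]

# An antibody is an unordered pair of half-molecules, repeats allowed.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# Only the pairing of each heavy chain with its own light chain, one half of
# each specificity, gives the desired bispecific structure.
bispecific = [ab for ab in antibodies if set(ab) == {"H1L1", "H2L2"}]

print(len(antibodies), len(bispecific))  # 10 1
```

Under random assortment the desired molecule is therefore one species among ten, which is why affinity purification steps are needed.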

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.



5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).






5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolaca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, sapaonaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCL),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.



4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
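
By way of illustration only, recording a sequence as an ASCII file might be sketched as follows. The FASTA-style layout, the function names `record_sequence` and `read_sequence`, and the 60-character line width are assumptions made for this sketch and are not prescribed by the text above.

```python
from pathlib import Path


def record_sequence(path, seq_id, sequence, width=60):
    """Record one nucleotide sequence as a plain ASCII, FASTA-style text file.

    The layout (a '>' header line followed by fixed-width sequence lines)
    is an illustrative choice, not a format required by the specification.
    """
    sequence = "".join(sequence.split()).upper()
    lines = [f">{seq_id}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


def read_sequence(path):
    """Read the record back, returning (identifier, sequence)."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    return lines[0][1:], "".join(lines[1:])
```

The same record could equally be stored as a row in a database table; the text file is used here only because it needs no additional software.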

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
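
A minimal sketch of ORF identification is given below for illustration. It is not the BLAST or BLAZE approach referenced above; it simply scans the six reading frames for an ATG followed by an in-frame stop codon. The function names and the `min_codons` threshold are assumptions introduced for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return (strand, start, end, orf) tuples for ATG...stop ORFs.

    All six reading frames are scanned. start and end are 0-based,
    end-exclusive offsets on the strand that was scanned, and min_codons
    counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

For example, `find_orfs("CCATGAAATTTTAGGG")` reports a single ORF, `ATGAAATTTTAG`, on the forward strand starting at offset 2.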

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but are not
limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NPOLYPEPTIDE1A). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
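
The relationship between target length and chance matches can be sketched as follows. The estimate assumes independent, equally likely residues, which real sequences only approximate, and the exact-match search shown is a simple stand-in for the homology search software named above. Both function names are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of a target in a database.

    Assumes independent, equiprobable residues (an idealization). Use
    alphabet_size=4 for nucleotides and 20 for amino acids.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


def find_target(target, stored):
    """Return {name: [0-based offsets]} of exact matches of target.

    stored maps a sequence name to its sequence, standing in for the
    data storage means described above.
    """
    target = target.upper()
    hits = {}
    for name, seq in stored.items():
        seq = seq.upper()
        offsets = []
        pos = seq.find(target)
        while pos != -1:
            offsets.append(pos)
            pos = seq.find(target, pos + 1)  # allow overlapping matches
        if offsets:
            hits[name] = offsets
    return hits
```

Under these assumptions a six-nucleotide target is expected about once by chance in roughly 4,096 positions, while each added nucleotide lowers that expectation fourfold, which is why longer targets are less likely to occur at random.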

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
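
As one illustration of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a stem whose two arms are perfect reverse complements separated by a short loop. This sequence-level definition, the function name `find_hairpins`, and the default stem and loop sizes are assumptions for the sketch; it ignores folding energetics and mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem_seq, loop_seq) for each perfect stem-loop candidate.

    A candidate is a stem of `stem` bases, a loop of min_loop..max_loop
    bases, and then the reverse complement of the stem.
    """
    seq = seq.upper()
    found = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        partner = left.translate(_COMPLEMENT)[::-1]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the right arm would begin
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == partner:
                found.append((i, left, seq[i + stem:j]))
    return found
```

For instance, in `TTGGACAAAAGTCCTT` the arms `GGAC` and `GTCC` flank the loop `AAAA`, and the function reports that single candidate.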


4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
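
The sequence-level part of antisense design, i.e., deriving oligonucleotides of 20 to 40 bases complementary to the mRNA, might be sketched as follows. The function name `antisense_oligos` and the sliding-window scheme are assumptions for illustration; selecting an effective oligonucleotide in practice involves further criteria (target accessibility, specificity) that are not modeled here.

```python
# Maps each RNA base to the DNA base that pairs with it.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligos(mrna, length=20, step=1):
    """Return (start, oligo) pairs for every window of the mRNA.

    Each oligo is the DNA reverse complement of mrna[start:start + length],
    written 5' to 3'. Lengths outside the 20 to 40 base range preferred in
    the text are rejected.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    mrna = mrna.upper().replace("T", "U")  # accept cDNA-style input too
    oligos = []
    for start in range(0, len(mrna) - length + 1, step):
        window = mrna[start:start + length]
        oligos.append((start, window.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]))
    return oligos
```

For the hypothetical 22-base message `AUGGCCAAGUUCGAUCCGAUGA`, the first 20-base window yields the oligonucleotide `ATCGGATCGAACTTGGCCAT`.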


4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.



4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
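
A degenerate pool of this kind can be enumerated from a probe written with the standard IUPAC nucleotide ambiguity codes, as in the following sketch. The function name `expand_degenerate` is hypothetical, and the use of IUPAC codes to write the probe is an assumption of this illustration rather than something stated in the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(bases) for bases in product(*choices)]
```

The pool size is the product of the number of choices at each position, so `ARY` expands to four sequences (`AAC`, `AAT`, `AGC`, `AGT`) and the pool grows quickly as degenerate positions are added.
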
Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al, 1985; Dahlen et al, 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al, 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
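
The dilution arithmetic implied by the linkage method can be checked with a short sketch. The helper names are hypothetical, and the 90 ul sample volume used in the usage note is an assumed figure chosen only to make the numbers concrete; the text does not state the volume of the DNA solution.

```python
def stock_volume_for_final(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so that the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_volume + v) for v.
    Concentrations share one unit and volumes share another.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final_conc must lie between 0 and stock_conc")
    return sample_volume * final_conc / (stock_conc - final_conc)


def mixed_concentration(conc_a, vol_a, conc_b, vol_b):
    """Concentration of a solute after combining two solutions."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)
```

Under these assumptions, `stock_volume_for_final(90.0, 0.1, 0.01)` gives about 10 ul of 0.1 M 1-methylimidazole for a 90 ul DNA sample, and `mixed_concentration(0.2, 25.0, 0.0, 75.0)` gives an EDC concentration of about 0.05 M in each well once 25 ul of the 0.2 M solution is added to 75 ul of DNA solution.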

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For example,
addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al, (1994) PNAS USA 91(11) 5022-6 (incorporated
herein by reference). These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic 
Acids Res. 18(24) 7455-6 (incorporated herein by reference). In this method, DNA samples are 
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lacZ minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
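
The difference between the normal and the relaxed (CviJI**) specificity described above can be illustrated with a short sketch. The Python below is an illustration only and is not the procedure of Fitzgerald et al.; the function names `cut_positions` and `fragments` and the toy sequence are assumptions introduced here. It models a complete digest of a linear molecule, whereas the reference describes partial digestion followed by size fractionation.

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T).
# A lookahead is used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")  # PuGCPy only
# Relaxed set named in the text: PuGCPy, PyGCPy and PuGCPu (PyGCPu is not cut).
RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq, relaxed=False):
    """Return the 0-based index of the C in each site.

    The blunt cut falls between the G and the C, i.e. just before this index.
    """
    pattern = RELAXED if relaxed else NORMAL
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split seq at every cut position (complete digest, linear molecule)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence containing one site of each of the four NGCN classes.
toy = "AAGCTTTGCTAAGCATTGCA"
normal_cuts = cut_positions(toy)
relaxed_cuts = cut_positions(toy, relaxed=True)
relaxed_fragments = fragments(toy, relaxed=True)
```

In this toy sequence the relaxed pattern finds three sites where the normal pattern finds one, which is the sense in which the altered specificity yields a denser, more nearly random set of fragment ends.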

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5 
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) for repeated transfer of about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
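
The arithmetic behind the 64-patient example can be checked with a short sketch. The Python below is illustrative only; it assumes that the 64 dots of each subarray are laid out as a square 8 x 8 block, that each dot occupies 1 mm x 1 mm, and that the 96 pins are arranged 8 x 12. The function name `membrane_footprint_mm` and its parameters are hypothetical.

```python
import math


def membrane_footprint_mm(samples_per_subarray=64, pin_rows=8, pin_cols=12,
                          dot_pitch_mm=1.0, gap_mm=1.0):
    """Return (height_mm, width_mm, total_dots) for offset-printed subarrays.

    Assumes each subarray is a square block of dots and that a gap of
    gap_mm separates neighbouring subarrays.
    """
    side = math.isqrt(samples_per_subarray)
    if side * side != samples_per_subarray:
        raise ValueError("this sketch assumes a square subarray")
    subarray_mm = side * dot_pitch_mm   # 8 dots x 1 mm = 8 mm
    pitch_mm = subarray_mm + gap_mm     # subarray plus spacer = 9 mm
    height = pin_rows * pitch_mm - gap_mm   # no trailing gap after the last row
    width = pin_cols * pitch_mm - gap_mm    # no trailing gap after the last column
    total_dots = samples_per_subarray * pin_rows * pin_cols
    return height, width, total_dots


footprint = membrane_footprint_mm()
```

Under these assumptions the 96 subarrays occupy about 71 mm x 107 mm and hold 6144 dots, which fits on the 8 x 12 cm membrane mentioned above. The resulting 9 mm subarray pitch also equals the usual well-to-well spacing of a 96-well plate, so each pin can fill its own subarray by offset printing.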

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 
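
The idea of clustering clones by their oligonucleotide signatures can be sketched in a few lines. The Python below is a simplified illustration and not the SBH signature analysis actually used: it assumes an idealized yes/no hybridization result (a probe "hits" a clone if the probe string occurs in the clone sequence), a Jaccard similarity between signatures, and a greedy single-pass grouping. The function names, the threshold of 0.8 and the toy clones and probes are all hypothetical.

```python
def signature(clone_seq, probes):
    """Set of probes that would give a positive signal for this clone (idealized)."""
    return frozenset(p for p in probes if p in clone_seq)


def jaccard(a, b):
    """Similarity of two signatures: shared probes over all probes hit by either."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def cluster(clones, probes, threshold=0.8):
    """Greedily group clones whose signature matches a cluster representative."""
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep, names in clusters:
            if jaccard(sig, rep) >= threshold:
                names.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [names for _, names in clusters]


# Hypothetical 7-mer probes and toy clone sequences.
probes = ["ACGTACG", "TTTTAAA", "GGGCCCA"]
clones = {
    "c1": "GGACGTACGCCTTTTAAAGG",
    "c2": "ACGTACGTTTTAAACC",
    "c3": "TTGGGCCCATT",
}
groups = cluster(clones, probes)
```

One clone from each resulting group would then play the role of the representative clone selected for sequencing. Real hybridization data are signal intensities rather than exact string matches, so an actual implementation would need a scoring scheme that tolerates noise.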

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.1.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358, 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage 
with BLAST score greater than 300 and percent identity greater than 95%. 
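
The extension procedure described above can be sketched as follows. The Python below is a minimal illustration under stated assumptions, not the software actually used: the `search` callable is a hypothetical stand-in for a BLASTN search of the databases with the current assemblage as query, the `Hit` record and function name are invented for this example, and the recursion is written as an equivalent loop. Only the two inclusion thresholds (score greater than 300, identity greater than 95%) and the stopping rule come from the text.

```python
from typing import Callable, Iterable, NamedTuple, Set


class Hit(NamedTuple):
    seq_id: str
    score: float
    percent_identity: float


def extend_assemblage(seed_id: str,
                      search: Callable[[Set[str]], Iterable[Hit]],
                      min_score: float = 300.0,
                      min_identity: float = 95.0) -> Set[str]:
    """Grow an assemblage from a seed until no further sequence qualifies."""
    members = {seed_id}
    while True:
        new = {h.seq_id for h in search(members)
               if h.score > min_score and h.percent_identity > min_identity}
        new -= members
        if not new:          # nothing extends the assemblage: terminate
            return members
        members |= new


# Hypothetical toy "database": hits returned for each member sequence.
toy_hits = {
    "seed": [Hit("a", 350, 97.0), Hit("b", 280, 99.0)],   # b fails the score cutoff
    "a": [Hit("c", 400, 96.0), Hit("d", 500, 95.0)],      # d fails the identity cutoff
    "c": [Hit("seed", 900, 100.0)],
}


def toy_search(members):
    return [h for m in members for h in toy_hits.get(m, [])]


assemblage = extend_assemblage("seed", toy_search)
```

Both cutoffs are applied as strict inequalities, following the wording "greater than"; a hit at exactly 95% identity is therefore excluded in this sketch.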

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth 
below. The polypeptides were predicted using a software program called FASTY (available from 
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated 
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98 
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 1. 

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank. Other computer programs which may 
have been used in the editing process were phredPhrap and Consed (University of Washington) and 
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice 
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327. 
Table 1 shows the various tissue sources of SEQ ID NO: 1-327. 

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3 
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved 
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result 
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid 
sequences which the nucleic acid sequences encode are shown in the Sequence Listing. The 
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.3.2 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413. 
Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept. 
The translated amino acid sequences which the nucleic acid sequences encode are shown in 
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in 
Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.3.2 EXAMPLE 5 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117, 
UniGene version 117, Genpept release 117). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652. 
Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652. 

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP 
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The 
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from 
Genpept. The translated amino acid sequences which the nucleic acid sequences encode are 
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are 
shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.4.2 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118, 
UniGene version 118, Genpept release 118). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745. 
Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745. 

The homologues for SEQ ID NO: 1653-1745 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences 
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.5.2 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119, 
UniGene version 119, Genpept release 119). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants, 
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768. 
Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768. 

The homologues for SEQ ID NO: 1746-1768 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed 
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences 
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues 
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

5.6.2 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120, 
UniGene version 120, Genpept release 120). Other computer programs which may have been used 
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready, 
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences which the nucleic acid 
sequences encode are shown in the Sequence Listing. The full-length nucleotide sequences, 
including splice variants, resulting from these procedures are shown in the Sequence Listing as 
SEQ ID NOS: 1769-1786. 

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homologues for SEQ ID NO: 1769-1786 were obtained by a BLASTP version 2.0al 
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq 
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for 
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID 
NO: 1769-1786 are shown in Table 2 below. 

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and 
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from 
Center for Biological Sequence Analysis, The Technical University of Denmark). The process 
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also 
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the 
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their 
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by 
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference, 
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in 
each of the polypeptides and the maximum score and mean score associated with that signal 
peptide. 

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS. 

TABLE 1 

Tissue Origin 

RNA Source 

Hyseq 
Library Name 

SEQ ID NOS: 

adult brain 

GIBCO 

AB3001 



adult brain 



9 19-21 50-51 6 
8S 67 107-108 1 
140 150-152 159 
202-203 212-214 
251 258 268-269 
298 301 321 326 
357 362 369 379 
443 459-460 473 
500 503 519 526 
608-609 613 618 
652 657-658 660 
695 697 710 715 
796 804 811 857 
900 912 519 922 
962 979 988-985 
1008 1016 1039 
1067 1070 1078 
1116-1117 1131 
1149 1151 1157 
1234 1241 1243 
1279 1286-1290 
1312 1320 1323 
1361 1368 1373- 
1400 1417 1446 
1494 1501-1503 
1517 1522-1524 
1549 1565 1578 
1623 1625 1627 
1649 1653 1664 
1734 1741 1743- 
1771 



5-66 72 78 80 8; 
13 116 123 138 
169 177 192-193 
225-226 235-236 
272 280-281 295 
331-332 334 356- 
382-383 416 423 
475 477 488 496 
547 574 582 587 
633-634 645-646 
669-671 678 667 
724 731 775-777 
-859 862 869 895- 
924-929 933 936 
996 1001 100^- 
1047 1059 106« 
1082 1107 1113 
1134-1137 1140 
1180 1206 122£ 
1258 1272-1273 
1294 1307-1308 
1330 1356 1360- 
1375 1379 1391 
1468 1482 1493- 
1506-1507 1512 
1530-1533 1537 
1598 1606 1608 
1639 1643 1646- 
1667 1671 1696 
1744 1760-1763 



GIBCO 



ABD003 



3 12-14 16-19 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 BO 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 255 
261 268-269 272 276 280-281 284- 
288 291-252 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-361 
393 401 408 414 419 424 426-426 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 565- 
570 574-576 506-588 593 595 557 
601 606-609 615-620 622-623 62S 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 76S-766 
773 775-77B 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-96S 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089 

1097 1103 1107 1109 1112 1116-1117 
1119 1121 1124 1127 1130 1134 
1144-1145 1149 1151 1157-1158 1167 
117C 1178 1184 1186 1190 1193-1194 
1200 1202 :2is-1217 1220 122f-1227 
1225 1231 1241 1243 1247 1252 
1256 1263 1267 1269 127? 1281 
1284 1286-1289 1293-1294 1306-1307 
1312 1316-1320 1326 1333 1338 
1341 1344 1348 1351 1355-1357 
1368 1374 1377 1380 1386 1389-1390 
1394 1400 140S 1414 1422-1423 
142S-1427 1437 1443 1446 1454 
1456 1458-1459 1468 1470-1472 1478 
1482-1483 1487-1486 1493 1497 1499 
1506 1508-1511 1517 1522-1524 
1530-1533 1545-1546 1548-1550 1552 
1557-1559 1563 1565 1567 1569 
1571 1586 258e 1591 1593 1595 
1598-1601 1608 1611 1620-1621 
1624-1626 1628 1630-1632 1636 
1640-1641 1644-1645 1647 1649 
1653-165S 1657 1664 1667 1669 
1673 1678-1681 1686 1690 1694-1696 
1701 1709 1711 1719 1722-1723 
1726-1727 1731-1733 1736 1740 
1743-1744 1747 1749 1753 1757-1758 
1760-1761 1765 1771 1785 







adult brain 

Clontech 

ABR001 

adult brain 



9 29 68-69 113 
223 245 277 307 
344 348 352 362 
408 414 441-442 
506 517 586 597 
715 799 003 833 
882 908 920 937 
1027 1036 1041 
1112 1121 1127 
1147 1231 1236 
1320 1345, 1355 
1400 1417 1448 
1570 1572 1609- 
1626 1645 1653 
1786 



115 1 
320 
379 
454 
631 
865 
1000 
1043 
1136- 
1239 
1361 
1456 
1610 
1754 



46 152 206 
324 230-331 
384 393 404 
469 481 490 
641 659 691 
871 675 880 

1005-1006 
1075 1107 
1137 1144- 
1280 1293 
1383-1386 
1476 1507 
1614 1620 
1759 1770 



Clontech 



ABR006 



5-8 15-16 168 212-213 271 27B 
280-281 291-29: 300-301 310 314 
321 326 336-33P 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 669 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1252 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1665 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



adult brain 



Clontech 



ABR008



5-10 13-19 22-23 25 29 33 37-39 
43-45 50-51 54-55 57-58 60-66 
60-70 72 75 77-80 63 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 268- 
172 174-175 181-164 188-190 193- 
194 196 198-200 202 204-205 207- 



WO 01/53312 



PCT/US00/34263



Tissue Origin | RNA Source



Hyseq
Library Name



SEQ ID NOS:



206 210 214-215 218 221-226 22i 
231-232 234-241 245-247 251-252 
25S 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 233-33E 
341 344-347 349 352 354 356-357 
362 365-373 376 379-380 382 384 
387 390-3S1 393-394 397 399-403 
405-411 414-415 417-420 426-42e 
437-438 440-444 453-455 462 464 
467 469-471 476 478 492-484 486- 
491 497 503 506-513 516-517 520 
524-526 528-530 532-534 537-540 
542 544 547-551 553 561 565-567 
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 675 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-776 780-782 
784-785 787-789 794 796 799 802- 
803 80S 811 814-815 818 825-826 
834-837 839-840 842-B43 856-859 
861-862 865 867-872 874-875 881 
883-684.887 889-892 894-895 897- 
898 SOI 904 908 910 912 914 917 
919 921-924 926-927 930-932 935- 
941 943 945 949 953-954 958 961- 
963 967 969 971 975 977 981-983 
986 988-990 992 997 995-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 116C- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199- 
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1259 1305-1309 1312 1314 1316- 
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463- 
1464 1466 1468 1471 1477 1480 
1482-1483 1496 1502-1504 1507- 
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1585-1589 1597-1598 
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 16B9-1690 
1694-1696 1704-1705 1708-1709 
adult brain



adult brain



adult brain 



adult brain 



Clontech 



ABR •. - 1 



BioChain 



Invitrogen 



abr i : ;• 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1752-1761 1765 1767 1771- 
1772 1776-1777 1779-1760 1786 



24 75 102 186 210 310-211 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



46 182-164 204-205 300 739 767 
1371 1545 1620 1684 



185 204-205 364-365 393 497 595 
687 692-694 830 645 1068 1320 
1413 164C 



Invitrogen 



187 301 357 364-365 375 454 463 
731 859 93S 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1736 



adult brain



Invitrogen 



ABRC 



419 434-435 441-442 763 789 983 
1320 



adult brain 
adult brain



Invitrogen 



ABR!.* 2 1- 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



Invitrogen 



ABTOtf, 



14-16 22-23 25 37-3 
70-72 78 86 94 107 
137 143 146 152 161 
194 196 158 210 218 
295 298 309-310 320 
338 346-347 349-350 
371 379-380 382-383 
399 401 408 428 438 
482 490 502 507-509 
557 562 597 602 607 
655 667 669 671-672 
696 710 712 715 721 
750 753 766 778 780 
814 826 830 837 841 
894-895 925 937 949 
961 963 968-969 988 
1005-1006 1016-1019 
1037 1052 1086 1090 
1115 1120-1121 1123 
1137 1140 1144-1147 
1170 1174 1188 1193 
1225 1229 1231 1254 
1280 1285 1309 1312 
1341 1343-1344 1356 
1378-1379 1383-1384 
1423 1429 1434 1442 
1452 1454 1470-1472 
1525 1528-1529 1532 
1554 1557-1559 1561 
15B5 1588 1590 1595 
1608 1610-1613 1615 
1627 1640 1644 1647 
1666 1670 1675 1696 
1723 1727 1738 1760 
1779 1785-1706 



9 43 58 60 
113 116 136- 
173 182-1B4 
229 259 267 
321 324 336- 
356-357 362 
391 393 396 
459 461 476 
516 526 531 
-609 624 652 
687-669 695- 
732 739 743 
781 789 803 
857 869 874 
954-956 960- 
989 1000 
1021 1036- 
1109 1113 
-1124 1136- 

1151 1167 
-1194 1205 
1258 1262 
1334-1335 
-1357 1370 
1403-1404 
1448 1451- 
1482 1499 
1536 1547 
1562 1567 
1601-1604 
1619 1624 
1660 1664 
1704 1715 
1761 1768 



cultured 
preadipocytes 



Stratagene



adpoc: 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-138 168 171 188- 
189 196-198 261 267 276 288 293 
301 318 331 336-338 379-380 391 
400 428 430-431 510-512 520 524 
527 549 557 561 602 618 620 622 
631 637 647 670 681-662 710 731 
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002 
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1C97 1099- 
1102 1136-1137 1140 1219-1220 



adrenal gland 



Clontech 



1260 
1322 
1370- 
1437 
1602 
1660 
1711 
1760- 



1272 

1325 

1371 

1466 

1606 

1662 

171S- 

1761 



1297-1298 
1335 1345 
1396 1408 
1466 1533 
1614 1631 
1673 1687- 
1720 1742 
1765 1767 



1314 132C 
1365-136e 
1423 1431 
1539 1594 
1649-1650 
1688 1696 
1746 1749 
1771 1705 



ADR002 



4-10 15-16 25 29-31 
51 55 60 62-63 65-66 
116 118 122 126 130 
170 181 192 198 201 
228 247 251 255 267- 
281 285 2S5 298 311 
349 351-352 354 372- 
391 400 410 415-416 
431 434-437 439 445 
477 483 491 493 497 
519 527 535 546 549 
581 588 595 600 602 
628-630 637 645-646 
713 715 719 732 734 
773-778 789 816 829 
869 875 863 B98 904 
930-931 942 948 952 
976-977 981 990 952- 
1004 1049 1055 1059 
1076 1112-1113 1115 
1134-1135 1151 1158 
1181 1188 1209 1218 
1227 1231 1243 1270- 
1280 1285 1290 1293 
1325 1327 1330 1342- 
1348 1365-1366 1369 
1387 1398 1400 1405 
1426 1436 1440-1441 
1463-1464 1488 1491 
1538 1546 1S67 1573- 
1598 1609 1614 1618 
1627 1634 1636 1649 
1671 1674 1678-1679 
1703 1717 1727 1731- 
1765 



43-45 47 50- 

75 80 102 
137 150 169- 
203 215 227- 
269 271. 280- 
336-338 342 
373 383-385 
424 426-427 
454 461 473 
498 503 516 
552 572-573 
608-610 620 
670 679 703 
744-746 758 
837 845 848 
912 S22-923 
965 567 969 
993 1001 
1071-1072 
1121 1127 
1163 1175 
1224-1225 
1271 1274 
1307 1324 
1343 1345 
1378-1375 
1417 1425- 
1444 1454 
1507 1512 
1575 1588 
1622 1624 
1651 1658 
1691-1692 
1732 1737 



adult heart 



GIBCO 



AHR001



4-8 10-11 1 
46 50-52 57 
85 87 89 94 
110 112 114 
127 130-132 
147-151 153 
186 192 195 
215 220 225 
236 251 257 
277 280-232 
298-301 304 
325 330 333 
352 354 358 
384 387-388 
408-409 411 
433-439 445 
457 459 462 
483-484 487 
503 506 508 
526 534 536 
560-562 574 
587 589 593 
612 615-620 
645-652 656 
674-675 6e3 
701 709 712 



5-16 18-21 34-39 44- 
-58 60 62-63 71 75 82 
97 100 103-104 106- 
116 116-119 122-123 
134 136-138 141-144 
163-164 168-171 179 
197 199 204-205 212- 
-226 229-230 232 234- 
260 262 265 272 274 
285-286 289-292 296 
307 309 314 321 324- 
336-338 345 349 351- 
361 368 370 380 383- 
391 393 397 401 406 
-412 414-416 430-431 
446 449 452 
469 472-473 
-490 492-493 

510-513 516 519-522 
-540 542 546 549 553 
577 581-582 584 586- 
595 597 604-609 613- 
622-623 626 632 637 
-660 665-666 670-672 
-684 687 692-694 697 
715-716 719-720 725- 



454-455 
476-480 
496-498 
adult kidney 



726 728 730-732 735 
744 746 751 753 759 
771 775-780 785 788- 
604 810 812 817 821 
637 843 845-847 849- 
663-864 869 871 875 
683 887 890-892 894- 
501 903 $06-907 911- 
521-925 927-928 933- 
561-963 967 969-972 
580-986 990 992 999- 
1007 1010 1016 1019- 
1023 1025 1028-1037 
1043 1047 1050 1054- 
1059 1063-1064 1067- 
1072 1075-1076 1083 
1089 1093-1094 1104 
1109 1113 1116-1117 
1124 1126 1128 1131- 
1145 1148-1149 1151 
1169-1170 1175 1177 
1199-1200 1202 1206- 
1216 1218 1222 1227- 
1235 1238-1241 1243- 
1248 1250 1253-1254 
1261 1268 1270-1271 
1282 1287 1292 1298- 
1308 1317-1321 1324- 
1332 1334-1337 1339 
1349-1350 1354-1356 
1365-1366 1369 1371 
1378-1380 1383-1384 
1400 1403 1409 1417 
1437 1439 1442 1444 
1450 1453 1468 1470 
1481 1488 1490 1501- 
1521 1524 1528 1530- 
1537 1539, 1541-1542 
1555 1560* 1565 1567- 
1591 1597-1598 1601- 
1614-1616 1619-1620 
1630-1632 1634 1636 
1645 1647 1649 1652- 
1662 16G7 1673-1674 
1684 1686-1688 1704- 
1711-1712 1717 1724 
1731-1733 1737-1738 
1744 3749 1754-1755 
1765 1772 1785 



736-739 743- 
761 765 770- 
790 796 802 
826 828 630 
853 857-661 
877-875 681 
895 897-858 
913 915 S15 
935 945 958 
975 977-578 
1002 1005- 
1020 1022- ■ 
1039-1040 
1055 1057 
1068 1070 
1085-1087 
1106 110E- 
1119 1121 
1134 1144- 
1158 1167 
1192 1196 
1208 121: 
1229 1232- 
1244 1247-
1256-1258 
1277 1260- 
1299 1306 
1325 1330 
1344-1345 
1359-1360 
1374-1375 
1389 1397 
1423-1426 
1446-1447 
1473 1475 
1504 1515 
1534 1536 - 
1547 15S:- 
•1571 1586 
1602 160t 
1623-1626 
1641 1644- 
1655 16E5 
1680-166: 
1705 1705 
1726-1727 
1741 1743- 
1760-1761 



GIBCO 



AKD001 



4-8 10-11 1 
45 50-51 56 
77 80 82 35 
104 107-108 
127-333 136 
147-154 157 
172 176 178 
201 203-206 
216 223-22B 
253 257-259 
272 274 276 
290 293 296 
307 311-313 
333 341 344 
359 362 364 
376-377 380 
401 404 407 
430-437 443 
45S 459 461 
474 476-477 



7-21 29-31 3 
-58 60-61 64 
87 92-94 97 
112 116-117 
-137 139-141 
161-163 165 
179 192 194 
209-210 212 
234-236 238 
261-262 265 
-277 279-281 
298-299 301 
321 325-326 
348-350 352 
365 36B 370 
-382 392 395 
409 414-415 
-444 446 449 
462 464 467 
480-481 483 



5-3S 42- 
68-69 75 
100 102- 
119 123 
143-144 

-166 169 

-197 199 
213 215- 
247 251- 
269 271- 
234-286 

-302 304 
329-321 
3S6 358- 

-372 374 
398 400- 
423-424 
451 453- 
469 471- 
487-488 
adult kidney 



490-431 493 497-505 
520 522 524 526-52S 
544 547 549 554-556 
567 571-576 578 582 
593 598-S99 601 604- 
615-619 621-626 632- 
645-652 655 660-664 
678-679 688 652-695 
713 717 719-720 727 
738 743 745-746 751 
763 765 771-773 775- 
788 793 795-796 800 
810-ei2 814-819 821 
834-838 842-84S 848- 
864-865 867 869 871 
836-687 889-891 693- 
902 906-908 910-914 
925-927 929-935 937 
948-949 951 953-956 
964 569-970 972 976- 
908-990 992-993 995- 
1004-1008 1010 1012- 
1017 1019-1020 1022 
1035 1038-1040 1042 
1050 1054-1055 1057- 
1070-1073 1078 1085- 
1089 1092 1094 1097 
1107 1109-1112 1116 
1123-1125 1132-1135 
1143 1146-1147 1149- 
1154 1157 1159 1163 
1178-1179 1281 1183 
1200 1202-1204 1206 
1219 1221-1222 1225 
1232-1234 1238-1241 
1246-1247 1253 1257- 
1261 1267-1268 1270 
1281 1283 1287-1239 
1299 1306 1308 1311- 
1320 1323 1329-1330 
1339 1341 1349-1350 
1359 1367 1369 1373 
1379 1394 1397 1400 
1407-1409 1417 1419 
1428-1431 1433 1437- 
1443 1445-1446 1448- 
1454 1459 1461 1465- 
14/5 1478. 1484-1488 
1493 1495 1497-1498 
1509 1512 1518 1521- 
1527-1528 1532-1533 
1541 2547-1550 1552 
1561 1565-1566 1568 
1578-1579 1583 1586- 
1591-1592 1594 1598 
1604 1606 1608 1611 
1616 1618-1622 1624- 
1632 1634-1636 1638- 
1644 1646-1649 1653- 
1664 1666-1667 1670- 
1679 1683-1684 1686 
1696-1699 1701 1709- 
1714 1716-1719 1723- 
1727 1733 1737-1738 
1744 1748-1749 1751 
1763-176B 1778 1780 



510-513 516- 
534 537-540
560 562 564 
586-589 592- 
606 6CB-61? 
634 637-643 
669-672 676 
698 702 713 
731 735-736 
753 755 762- 
778 780 786 
803 805 808 
B26 829 832 
855 857-863 
874 876-883 
896 8S6-900 
918 920 922 
940-942 945 
960-961 963- 
978 982-98f 
997 999-1002 
1013 1026- 
1025-1031 
1044 1047 
1064 1066 
1086 1088- 
1099-1102 
1119 1121 
1140 1142- 
•1150 1153- 
1167 1170 
1192 1196- 
1211 1216- 
1227-1230 
1243-1244 
1258 1260- 
1272-1274 
1293-1295 
•1313 1317- 
1334-1335 
1353-1357 
1375 2378- 
1403 2405 
1423-1424 
1438 1442- 
1450 1453- 
1466 1474- 
1490 1492- 
1506-1507 
•1522 1525 
1537 1540- 
1556-1559 
1571 1575 
•1587 1589 
1600 1603- 
1613 1615- 
•1628 1631- 
1639 1641 
165G 16G2 
1671 1676- 
1691-1692 
1711 1713- 
•1724 1726- 
1741 1743- 
1760-1761 
1785 



Invitrogen



AKT002 



20-21 37-39 47 52 57 60 65-66 
68-69 80 104 207-108 122 130 133 
136-137 140 142-143 149 169 174 



362 197 227-228 235-236 244 2.51 
261-265 267 280-281 286 290 299 
301 304-305 305 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430 
443 445 449 453-454 472 437-486 
504 506 513 516 519 522 528 536- 
540 546 554 S85 5B7 594 598 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768 
775-777 788 796 804 814 827 837- 
838 049-650 852-853 86S-870 881 
8S0-B92 698 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 1055 1063 1067-1066 
1073 1085 1099-1102 1107 1110- 
1111 1113 111S 1119 1126 
1136-1137 1146-1148 1153 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 
1356 1365 1378-1379 1403 
1419 1428-1429 1436 1446 14Se 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 160C 1519 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669- 
1670 1673 1686 1709 1727 1740 
1776 



1134 
1155 



lass- 
ie 14 



adult lung



GIBCO 



ALG001 



lymph node 



4-8 14 37-39 44-46 
63 75 82 88 93 103- 
133 140 143 150 152 
171-172 174-175 190 
211 214 219 223-224 
252 256 265 272 274 
310 332 345 351 362 
394 408-409 431 436 
461 467 469 471 476 
513 527 537-S40 544 
564 583 607 616-617 
634 645-646 662-664 
719 743-744 763 766 
811 814 817 831-832 
852-853 858-859 861 
901 905 941 954-957 
979 9B1 987 990 992 
1005-1006 1014 1017 
1054 1059 1062 1064 
1086-1069 1094 1107 
1136-1137 1142 1150 
1190 1200 1208 1220 
1273 1280 1282 1295 
1331-1332 1353 1374 
1384 1404 1409 1423 
1442 1474 1478 1494 
1525 1531-1532 1547 
1554 1571 1598 1606 
1627-1629 i632 1642 
1S69 1676-1677 1684 
1731 1732 1737-1738 
1786 



50-53 56 62- 
104 113 125 

154 157 162 
-191 196 200 
227-228 251- 
280-281 285 
371 381-382 
445 454 45S 
-477 468 504 
547-548 554 
621 623-624 
670 695 716 
774 789 803 
B37-83B 845 
866 680 887 
966 971 977 
996 1001 
1045 1047 
1072 1080 
1126 1134 
1157 1173 
1241 1272- 
1306 1320 
1379 1383- 
1434 1436 
1509 1522 
1549 1553- 
1613 1624 
1644 1662 
1696 1727 
1746-1749 



Clontech



ALN003 



4 24 50-51 82 105 137 153 19e 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 362 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547 
621 626 649
793 803 831 
85e 866 879 
1005-1006 10 
1117 1151 11 
12G5 1274 
1374 1377 14 
154S 1600 16 
1653 16 
1771 



1644 
1741 



679 719 
834-836 
905 913 
12 1038 
99 1204 
24-1325 
40-1441 
18-1619 
87-1688 



725-726 73b 
B38 844 857- 
928 963 976 
1050 1116- 
1226 1243 
1335 1355 
1447 1504 
1631 1641 
1691-1692 



GIBCO 



young liver 



ALV003 



adult liver | Invitrogen



ALV002 



5-8 II 20-21 46 50-51 58 65-6fc 
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 35S 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 470 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 6C1-602 
607 621-624 628-630 632-633 637 
648 660 666-667 67B 697-698 70C 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 079 887 
893 898-90C 902-904 906-907 511 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1C89 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1275 
1283 1295 1317-1320 1332 1335 
1344 1359 1362-1363 1375 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587 
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 175f 
1760-1761 1763-1765 176? 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 454 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 762 
794 814 820 826 834-837 847 845- 
850 858 861 874 879 893 89B 904 
911 918 921-922 926 946 948 972
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1465 1482 1504 1524 1542 1547 
1550 1567 1578 lb£± 1563 1594 
1597 1601-1602 1611-1612 1615 
1618-161S 1621 1625 1637 1645 
1647 1652 1654-1655 1660 1666 
1669-1671 1684 1706 1722 1737- 
1738 1742-1744 1760-1761 1753- 
1765 1772 1774 



adult liver 



Clontech 



ALVOO^ 



29 676 997 1063 1119 1536 1766 



adult ovary 



Invitrogen 



aovoo: 



1 4-18 20-23 29 35-4 
51 53-58 61-63 65-66 
77-78 80 82 85 87 89 
103-104 106-108 110 
122-124 126 128 133 
142 145-147 149-157 
170 174 177-178 180 
189 192-203 207 209 
221-224 229-230 234 
247 255 258 260-262 
272 274 277-281 284- 
295 299 301-302 304 
313-314 316 321 323- 
333 335-338 341 344 
356 358 360 362 370- 
379-384 387 390-352 
400 403 408-410 412 
424 426-427 430-435 
448-449 451 453-455 
471 473 476-479 481 
494 496-497 499-501 
514 516-517 519-520 
528-534 541-544 546 
554-555 561-564 566- 
572-573 575-576 573 
588 590-591 593 595 
605 607-613 615 638- 
630 632-633 636-640 
649-652 654-655 657 
677-678 681 683-684 
710 714-721 723 725- 
734-735 743-746 750 
763 765 767 772-773 
783-784 786 788 790 
800 803 805 809-811 
819 821-824 826 828- 
837-838 843-850 852- 
867 869 871-872 874- 
887-886 890-895 898- 
916 919-922 924 926- 
941 943-946 948-951 
961-964 966-967 970- 
985-986 988-990 992 
1001 1004-1009 1011 
1019-1020 1024-1025 
1033-1035 1037 1039 
1050-1051 1054-1060 
1067-1070 1072-1073 
1078-1079 1085-1086 
1094-1096 1096-1103 
1112-1117 1119-1120 
1131-1135 1142-1143 
1153 1156 1158 1163 
1169-1171 1173-1175 
1180 1183-1185 1190- 
1197-1200 1202 1205 
1219 1221-1226 1232 
1241 1243-1244 1247 
1254 1256-1258 1262 
-1268 1270 1275 1276 
1286-1289 1291 1293- 



0 42-48 50- 
68-69 73-75 
97 100-101 
113 115 118 
134 136-140 
161 166 168- 
182-186 188- 
211-215 219 
242-243 246- 
26S-269 271- 
286 200 290 
307 309-311 
326 330 332- 
349 352-353 
372 376-377 
394 397-398 
414-416 423- 
439 443-446 
462-463 468- 
484 487 489- 
503-505 509- 
522 524 526 
547 549 5S2 
567 569-570 
581 SC3 585- 
597 595 601- 
622 624-627 
642 644-647 
665 667-675 
652-695 697- 
727 729 732 
751 753 758 
775-77B 780 
791 794-796 
813-815 818- 
829 831-832 
857 859-864 
675 87£-e83 
910 912-914 
927 929-939 
953 955-958 
579 981-982 
995-997 599- 
1013 1.03 6 
1029-1031 
1041-1047 
1062-1064 
1075-1076 
1089-1090 

lice-iioe 

1123-1127 
1146-1145 
1165-1166 
1177-1178 
1151 1195 
1214 1217- 
1235 1238- 
1249 1252- 
1265 1267- 
1280-1283 
1294 1298- 



1299 

1323 

1338- 

1359 

1377- 

1394 

1427 

1443 

1463- 

1481 

1494 

1507 

1526- 

1538- 

1553 

1567 

1578 

1591 

1609 

1636 

1657 

1671 

1690 

1713- 

1726- 

1738 

1751 

1765 

1778- 



1306 

1327 

1339 

1361 

1379 

1400 

1429- 

1445- 

1464 

1484- 

1496- 

1511- 

1527 

1539 

1555- 

1569- 

1580- 

159S 

1611- 

1638 

1659- 

1673- 

1699 

1714 

1728 

1740- 

1753 

1767- 

1779 



1308 

1329- 

1341 

1365- 

1383- 

1404 

1431 

1450 

1466 

1485 

1498 

1517 

1530- 

1541 

1559 

1570 

1581 

1597- 

1621 

1641 

1662 

1674 

1702- 

1716- 

1731- 

1741 

1755- 

1768 

1783- 



1312 
1330 
1343- 
1366 
1384 
1416- 
1435- 
1453 
1468 
1488 
1501 
1519 
1531 
1546 
3561 
1572 
1587 
1598 
1623 
1643 
1664 
1676 
1707
1719 
1733 
1743 
1756 
1770 
1784 



1317- 

1332- 

1351 

1371- 

1366 

1417 

1436 

1454 

1470 

1491 

1504 

1521- 

1534- 

1546- 

•1563 
1574- 

-1588 
1600- 

•1630 
1645 
1667 

-1681 
1710- 
1723 
1735 

-1744 
1760 

-1771 
1786 



1321 

1333 

1356 

1375 

1389 

1422- 

1439- 

1459 

1474- 

1493- 

1506- 

1524 

1536 

1550 

1566- 

1575 

1590- 

1606 

1634 

1647- 

1669- 

1683- 

1711 

1724 

1737- 

1748- 

1762 

1776 



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624
636 719 755 773 860 890-891 924 
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



adult Placenta 



Clontech 



APL001 



placenta



Invitrogen



APL002 



14-16 26 29 43 
106 116 135 171 
198 210 216 235 
309 329 334 339 
423 430 434-435 
491 517 522 631 
738 746 769 816 
858 916 948 953 
1005-1006 1013 
1068 1070 3086 
1160 1277 1285 
1345 1429 1435 
1486 1490 1512 
1592-1593 1602 
1664 1673 1675 
1746 1776 



60-61 

177 
-236 
359 
448 
723 
843 
-954 
1033 
1139 
1317 
1438 
1519 
1626 
1722 



79-80 103 
180 194 196 
272 290 299 
379-380 417 
454 483 490- 
725-726 728 
854-855 857- 
976 9B8-9B9 
1036 1064 
1144-1145 
1320 1343 
1454 1482 
1532 1549 
1647 1649 
1727 1730 



adult spleen



3 5-8 12 15- 
44-45 57 60 
103 106 ioe 
147 152-153 
178-180 196 
215 219 234 
272 280-281 
325 333 341 
387 394 406 
448 451 473 
505 517 519 
554 557 574- 
611-612 620- 
652 659 661 
700 721 728 
746 762 765 
810-811 817 
852-853 858 



GIBCO



ASP001 



16 19-21 24 
82-83 87 89 
117 119-121 
155 166 169 
198 2C1-206 
253-254 256 
290 295 302 
349 358 372 
414 431 434- 
481 490-493 
530 534 536- 
576 582 592 

621 623 631- 
667 671 673- 
730 732 738 
774 78C 788- 

622 830 832 
862 866 874 



25 34-36 
94 98-99 
139 141 
171 174 
209-211 
258 264 
309 312 
382 386- 
436 446 
500 503 
540 547 
595 604 
632 642 
675 684
742-744 
789 794 
845 848 
879 882 
921-923 926- 
-958 963 977- 
996-997 9SS 
1031 103£ 
1059 106& 
1094 1X02 
1140 1163 
1196 1215- 
1236 1241 
1274 1295 
1334-1335 
1359-1360 
1397 1413 
1439 1466 
-1487 1498 
-1549 1553 
1631 1636 
1662 167C 
1686 1700 
-1741 1760- 
-1782 



884 906-908 912 919 
927 934 942 949 957 
978 963 990 992-994 
1005-1007 1010 1012 
1012-1044 1046 1049 
1070 1076 1089-1090 
1109 1113 1115 1124 
1170 1174 1177 3190 
1220 1226-1227 1229 
1246 1256 1269 1271 
1301 1320 1322 1330 
1335 1349 1351 1353 
1364 1369 1374 1386 
1417 1434 1436-1437 
1474 1477 1480 1485 
1512 1522 1525 1544 
1560 1567 1591 1600 
1651 1654-1655 1658 
1674 1676-1679 1684 
1727 1733 1738 1740 
1761 1774 1779 1781 



testis 



GIBCO 



ATS001 



5-8 10 26 30-33 47 
69 82 84-85 97 102 
139 150 152 154 156 
176-177 192 194 196 
227-228 247 255 258 
288-289 301 307 211 
349 370-372 392 298 
427 430-43:. 433 437 
469 473 477 481-482 
503 513 522 526 547 
564 572-573 575-576 
599-602 605 612 615 
637 647 645-650 656 
674-675 712 719-721 
738 744 746 773 780 
802 804 809 811 814 
843 645 848 859 866 
913 916 919 921 S26 
960 963 971 975 977 
993 1007 1016 1029- 
1035 1038-1039 1045 
1064 1070 1072-1073 
1097 1099-1102 1104 
1141 1149 1161-1162 
1209 1222 1227 1229 
1238-1239 1243 1253 
1289 1291-1293 1307 
1320 1330 1332 1338 
1373-1374 1379 1389 
1409 1423-1424 1430 
1443 1455 1484 i486 
1496-1497 1501 1505 
1527 1530-1531 1533 
1549 1563 2565 1567 
1577 1586 1591 1599 
1628 1630-1632 1636 
1649 1661-1662 1666 
1675 1684 1690 1699 
1717 1724 1T30 1737 
1767 177S 



50-51 57 66- 
113 119 137 

163 169 174 
-197 212-215 
261 282 285 
316 330 334 
410 415 426- 
446 454 461 
493 499 502- 
552-553 563- 
581-582 585 
617 620 631 
660 665 670 
723 728 731 
784 78.8-789 
826 831 837 
869 877 905 
929 937 950 
981 990 992- 
1030 1034- 
1059-1060 
1087 1089 
1108 1113 
1175 1208- 
1231 1235 
1285 1287- 
1311 1317- 
134S 136S 
1399-1400 
1435-1437 
1490 1493 
1509-1513 
1537 1546 
1569 157: 
1602 1625 
1639 1642 
-1667 1670 
1705 1712 
-1738 1752 



Genomic DNA 
from BAC 63I1B 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



686 1352 1412 



Research 
Genetics 
(CITB BAC 
Library) 



Genomic DNA 
from BAC 35316 



BAC002 



1411-1412 
Genomic DNA 
from BAC 29316 



adult bladder 



Research
Genetics
(CITB BAC
Library)



BAC002 



| 135^ 



Invitrogen



BLD001



Clontech 



5-8 17-16 22-23 33 37-39 56-5'; 
BO 93 100 120-121 169 201 23*7 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 564 6C7 616-617 626 635 
652 667 671 710 727 755-756 762 
773 766 789 837 840 866 893 896 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1634 1636- 
1637 1645-1650 1654-1655 1656 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



bone marrow 



BMD001



3-8 11 13 18 29-31 33 35-36 4C 
43-45 47-4e 50-51 57 6C 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
255 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 461- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 £30 535- 
540 542 544-545 549 555 565 567 
569-577 561 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 682 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-776 780 785-786 789-791 
798 802 810-812 823-824 826 
£32-833 837-838 843-844 84B- 
855 858-859 866-867 869 878-880 
883 890-892 896 9C3 90S 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 972 
976 981 985 987 990 992 955 1000 
1002 100S-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1226 1134-1135 1142 1X44- 
1145 1163 1172 1178 1197 1155- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1295 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1360 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1433 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 



796 
630 
1506 3509 1513 
1526 2528 1531 
1546 1546-1549 
1557-1559 1571- 
1592 1597-1600 
1626-1628 1630- 
1638-5639 1641 
1653-1655 1661- 
1684 1686 1690 
1713-1714 1717 
1727 1737-1738 
1772 1781-1702 



152T 
1536 
1552 
1572 
1609 
1632 
1646 
1662 
1702 
1720 
1740 
1785 



T52T 
-1537 
1554- 
1581 
1614 
1634 
1647 
1676- 
1707 
1722- 
1758 
►1736 



1524" 

1543 

155£ 

1585- 

162; 

1636 

165: 

3681 
1711 
1722 
1767 



bone marrow 



Clontech



BMD002 



11 15-16 19 30 
83-84 93 99 103 
139 16.9-170 174 
212-213 219 222 
255 259 264 273- 
292 295 301 303- 
316 324 326 330 
353 357 350 370- 
397 403-404 414- 
429-430 433-436 
465-466 472 475 
520 523 525 531 
569-570 581 583 
601 616-617 621 
659 671 674-675 
719 728 734 737 
774-778 790 811 
836 854-855 859 
879 864 889 892 
990 992 998 100 
1042 1048 1051 
1088-1089 1106 
1157 1192 1200 
1236-1237 1260 
1285 1287 1295 
1324-1327 1330 
1347 1350 1353 
1369-1370 1373 
1383-1384 1394 
1413 1417 1425- 
1446 1459-1460 
1521 1536 1546- 
1574- 1578 1598- 
3631 1634 1646 
1658 1669-1670 
1688 1690-1653 
1704 1707-1709 
1723 1725 1727 
1738-1740 1743- 
1760-1761 1767 
1786 



31 35-36 68-69 75 
108-109 118 137 
177 180 190 193 
225-226 232 237 
274 284 286 290- 
304 307 312-313 
334-335 348 352- 
■373 384 386-387 
416 421 425-427 
440 444 451 454 
478 491 493 516 
545 548 552 566 
590-591 597-596 
641 650 652 656 
679 684 710 718- 
•738 742 761 765 
814 818 830 834- 
866 869 871 878- 
904 922-923 932 
1 1004 1016 1036 
1054-1055 105E 
1112-1114 1155 
1223 1227-1228 
1261 1282-1283 
1314 1317-1323 
1333 3341 1343 
1355-1357 1367 
1377 1379 138: 
1397 1400 1406 
1427 1438 1447 
1470 1493 150S 
1549 1560 1573- 
1600 3621 1626 
1645 3653 16S£. 
1683-1684 1687- 
1696 1659 1702 
1711 1720 1722- 
1729 1731-1731- 
1746 1752 175b 
1777 1761-178: 



Clontech



73-74 503 922 1036 1711



bone marrow 
bone marrow 



BMD004 



Clontech



BMD007



S5-96 866 1320 1475 



adult colon 



Invitrogen 



CLN001 



17 56-56 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 401 485 503 510- 
512 550-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1037 1020 
1025 1027 1054-1055 1063 1066- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
Mixture of 16 
tissues - 
mRNAs 



1462-1464 1512 1556 1583 1587

1594 1596 1614 1625-1626 1631

1639 1645 1650 1675-1677 1667-

1686 1701 1713-1714 1724 1740

1765



Mixture of 16 
tissues - 
mRNAs* 



Various 
Vendors 



CTL016 | 401 1490 1686 



CTL021 I 312 782 1132-1133 1403 1712 1715 



Various
Vendors



adult cervix 



BioChain 



2VX002 \ 1 4-8 11 13 18-21 25-26 

37-39 43 46-47 56 61 64 
73-74 82 85 94 100 103-2 
118 122 126 130 134 140 
156 163 170 179 181 186 
196 198 201-202 218-219
231 257 266 276-277 285
298 301-302 304 307 312-
326 329-330 332 335 342
362 371-372 376 379 381
386 398 400 410 414 416
426-427 430-431 433-436
448 461-462 464 471-477
483 491 493 496 503 506
516-517 526 530 535 542
547 557 561 572-573 575-
582 585-586 588-589 593-
602 604-605 607-609 612
623 644 650 654 657-658
670 672 680 683 691-694
708-709 711 713 720-721 
731-732 737 745-747 753- 
765 771 774-777 780 790 
798 800 803 805 818 826 
832 834-836 843 847-848 
857-860 864-866 869 871 
880 882 887 890-891 897
905-908 912-913 916 918-
927 932 934-938 944 948
958 963-964 967 969-970 
978-979 983 985 990 992 
1005-1007 1016-1017 1024 
1033 1036 1038 1045 1047
1056 1066-1067 1071 1073 
1079 1082 1098 1113 
1134 1139 1146-1149 
1170 1173 1175 1177 1181 
1200 1202 1211 1234 1216 
1222 1225 1227 1232-1234 
1241 1243 1258 1264-1265 
1270 1279 1287-1290 1308 
1311 1316 1320 1323 1327 
1349 1353-1354 1360 1372 
1383-1384 1386 1394 1397 



1124 
1163 



30-31 33 
66 7j 
04 113 
147 153- 
192 195- 
222 229- 
286 288 
314 324 
352 358 
382 384 
415-420 
439 446 
479 482- 
510-513 
544 546- 
577 581- 
594 600 
615-619 
662-665 
698 706 
727 729 
754 760 
793 796 
828 831- 
8S1-8S5 
876 878- 
899-902 
919 922 
955-956 
972 976 
100C 
1027 

1 os;- - 

1075 
1125- 
1167 
1197 
122j- 
1240- 
126* 
1310- 
1345 
-1374 
1405- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).



1406 


1416 


1425- 


-1427 


1431 


2436- 


1437 


1442 


1446 


1448 


1453 


1455 


1466 


1472 


1478 


1482 


1456 


1503 - 


1503 


1506 


1512 


1522 


1527- 


I52t 


1531 


1533 


1541 


1547 


1569 


1571 


1585 


1589 


1597- 


-1598 


1600 


1608- 


1609 


1614- 


1616 


1620 


1623- 


1624 


3626- 


-1628 


1630 


1638 


1641 


1642 


1649 


1653 


1656 


1662 


1667 


1669 


167-5- 


-1675 


1683 


1685- 


-1688 


1699 


1702 


17C9- 


•1710 


1715 


1717 


1722 


1724 


1729 


1731- 


-1732 


1735- 


1735 


1741 


1743- 


-1744 


1748- 


-1749 


1755 


1760- 


-1762 


1767 


1773 


1778 


1785- 


1786 













diaphragm



BioChain 



diaoo: 



137 282 289 730 780 986 1409 
1478 15S9 1614 



endothelial
cells



Stratagene



EDT001



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-56
60-61 65-66 68-69 73-74 77-78 80
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 152-153 
161-163 166-172 176-179 187 190 
192 194 196-201 204-207 210 212- 
224 220 224 229-230 233 235-236 
240-241 251-252 258 261-262 265 
267-269 272 276-277 279-281 284- 
285 288 290 295-296 301-302 310- 
311 313 316 321 325 329 331-333 
335 340-342 351-355 360 371 375 
380-302 364 387 390 392 397 400 
407-408 410 412 414 416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
519 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 
572-576 579 581 585-586 589 593 
595 597 599 603 607-612 615-617 
620 622 626 630 632-634 638-641 
644 647 656-660 662-664 670 673 
678 680-682 692-697 707 709-710 
712-713 719 730 732 734 736 736 
743-746 751 759 768 771 773 775- 
778 783 786-789 793 800 803 805-
807 810-811 814 816-818 821-822
824 826 828-829 832 834-838 842-
845 848-850 854-860 862 864



871 874 876-879 883 885 887 



869 
890- 



891 894-895 898-900 903 908 910- 
913 916 919-922 924 926-928 930- 
935 939 943 948-949 951-954 957 
959-961 964 969-970 973 975-97e 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 3050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-2087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211 
1216-1217 2219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 



126S 
1277 
1290 
1217 
1230 
1345 
1367 
1400 
1424 
1440 
1468 
1491 
1511 
1531 
1547 
1561 
1579 
1592 
1615 
1631 
1650 
1669 
1696 
1719 
1736 
1755 
1771 



-1266 1268 
1280-1263 
1293 1255 
-1320 1324 
1334-1335 
-1347 1350 
1369 1374 
1406 1408 
-1426 1428 
-1442 1448 
1472 1474 
-1493 15C1 
1516 152C 
1536-1537 
1549 1552 
-1565 1568 
1581-1563 
1597 1605 
1618-1621 
1634 1636 
1652-1659 
1671 1675 
1698 1703 
1722-1723 
1739-1741 
1760-1761 
1773 1776 



1270-1272 
1285-3266 
1296 1306 
1325 1327 
1338 1342 
1355-1356 
1376 137S 
1414 1417 
•1431 3434 
1450 1462 
1476 1487 
-1504 1506 
-1521 1526 
1539-1540 
1555 1557 
1571 1575 
1587-1588 
-1606 1611 
1624-1628 
1638 1641 
1664 1666 
-1681 1683 
1711 1715 
1726 2731 
1743-1744 
1765 1767 
1779 1783 



1274- 
1286- 
1312 
1325- 
134? 
1359 
1398 
1415 
143E 
1466 
-1468 
1509 
1525 
1546- 
-1555 
1578- 
159C 
1613 
1630- 
1643- 
-1667 
-1688 
-1716 
-1733 
1745 
-17G8 
-1786 



Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



efkoo: 



286 
1411 



686 129" 1303-1304 1352 
-1412 1754 



esophagus



BioChain 



ESOOO; 



131-132 261 289 380 
1000 1007 13S7 



>03 860 092 



fetal brain 



Clontech



FBR00J 



62-63 89 112 126 194 322 336-338 
379 391 411 4 81 546 563 607 67S 
710 867 1012 1031 105< 1251 1262 
1320 1407 1643 1652 16B6 1731- 
1732 1746 1765 



fetal brain 



Clontech 



FBR004 



68-69 90-91 
362 374 403 
668 670 691 
1209 1216 12 
1387 1410 14 
1547 1593 



139 212-213 301 331 
436 611 645-646 659 
785 805 845 1163 
32-1233 1238-1239 
16 1430 3496 1536 



fetal brain 



Clontech 



FBROOb 



5-S 25 43 60 
00 87 92 101 
149 152-153 
207-208 210 
238 251-253 
301-302 307 
330 333-334 
357 370 373 
391-392 397 
411 417 421 
437 440-443 
476 483 488- 
513 516 51S- 
544 547 550 
590-591 595 
623 626-629 
657-658 660 
689 691-694 
710 716 720 
744 757-760 
806-807 810 
858 861 864 
894-895 890 
936 938 945 
959 961 963 



62-63 6 
103 10& 
251 168 
212-213 
266 272 
310 317- 
236-338 
377 379- 
3 99 402 
424 426- 
454 460 
489 495 
520 524 
S61 567 
597 604 
631 634 
665 669 
696-697 
728 732 
763 775- 
817-818 
671-872 
904 915 
550 952 
567 969 



5-66 70 72 

134 136 139 
171-172 175 
221-226 237- 
279-281 295 
318 321-324 
346-347 352 
380 382 384 
406-408 410- 
427 430 436- 
464 467 473 
497 508 510- 
530 537-540 
572-574 S82 
607-609 615 
636-640 655 
674-675 679 
659 701 706 
734 736 742- 
778 780 799 
826 839 843 
684 890-B91 
921-923 935- 
955-956 958- 
971 990 992 



999 1001 1005- 
1016 1022 1024 
1035 1042 1047 
1065 1067 1070 
1114-1115 1119 
1151 1153-1156 
1172-1373 1178 
1190-1200 1211 
1226-1227 1229 
1253-1255 1258 
1270-1273 1281 
1314 1317-1520 
1339 1341 1344 
1371 1373 1376 
1386 1392 1396- 
1425-1426 1428- 
144C-1441 1448 
1502-1503 1507 
1519 1536 1544 
1559 1573 1589- 
1511-1614 1619 
1640 1651 1657- 
1693 1696 1703- 
1718 1720 1722 
1730-1733 1735- 
1742 1745 1755 
1767 1771-1772 
1786 



006 1 
1029- 
1048 
1082 
1131 
1160 
1184 
1216 
1231 
1260 
1287 
1326 
1350 
1379 
1398 
1429 
1466 
1511 
1549- 
1590 
1621 
1658 
1704 
1724 
1736 
1759- 
1777 



008 1013 
1030 1032 
1052 1056 
1089 1109 
1143-1145 
1163 1167 
1186 1188 
1222-1223 
1236 124S 
1262 1266 
1308-1309 
1334-1335 
1356 1369- 
1381-1382 
1419 1423 
1432 1437 
1470 1482 
1513 1516 
1550 1557- 
1598 1608 
1625-1626 
1676-1679 
1713-1714 
1726 1728 
1738-1739 
1761 1765 
1779-1780 



235-236 520 864 1066 11B8 15B7 



fetal brain 
fetal brain 



Clontech 



FBRS03 
FBT002 



Invitrogen



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 1B1 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 S09 516 519 522 527 557 562 - 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-B15 625 
829 840-841 847 8S4-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1C01 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 11201128 
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 159S 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 



Invitrogen 



PKR001 



105 124 160 289 864 1036 1148 
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001 



5-6 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 



258 277 280 
371 397 392 
436 443 455 
563 572-573 
654 657-656 
798 821 633 
868 878 511 
992 1007 10 
1139 1285 1 
1371 1376 1 
1440-1441 
1618 1631 1 
1678-1675 1 



-281 307 
395 4 03 
469 500 
585 600 
660 679 
844 854 
929 958 
46 1087 
312 1331 
391 1422 
470 1543 
651 1654 
691-1692 



310 314 330 
422-423 431 
519 522 542 
619 623 650 
71S 731 780 
855 857 864 
960 969 990 

1103 1129 
1355 1369 
1425-1426 
1598 1601 

-1655 1669 
1733 1785 



fetal kidney 



Clontech 



FKD002 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



fetal kidney 



Invitrogen 



FKD007 



20-21 82 163 335 679 988-985 
1000 1227 1230 1320 1554 



fetal lung



Clontech 



flgco: 



fetal lung



35-36 54 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 117e 
1182 1200 1206 1309 1311 1345 
1429 1453 1567 1576 1620 1686 



Invitrogen 



TLG003 



9 15-16 29 41 47 68 
102 124 137 152-153 
229 231 249 254 256 
300 325 333 344-345 
379 3B4 408 426-427 
468 475 483 488 493 
545 547 549 564 582 
660 662-664 670 673 
761 766-767 774 805 
864 875 923 932 937 
988-989 1014 1016-1 
1090 1097 1170 1185 
1216 1224 1258 1290 
1342 1347 1355 1369 
1414 1431 1438 1449 
1536 1547 1557-1560 
1601 1636 1644 1653 
1667 1671 1675 1680 
1739 1760-1761 1769 



-69 83 88-89 
165 196 224 
267 291-292 
352 373 376 
430 432 467- 
516 531 535 
602 623 644 
725-726 728 
830 852-853 
946 949 963 
017 1024 1027 
1200 1215- 
1309 1320 
1381 1413- 
1491 1512 
1567 1690 
1655 1662 
-1681 1706 



fetal lung



Clontech



FLG004 



103 276 
1614 165 



334 465-466 737 843 1131 
b 



fetal liver- 
spleen 



Columbia
University 



FLS001 



3-11 13 
51 54 56 
77-80 82 
110 112 
135-139 
157 163- 
180 186 
200 202 
233-236 
255-256 
274 276 
293 295 
311 314 
332 342 
358 360 
386-387 
406 408 
437 439 
456 455 
487-488 
506 S0S- 
529 531 
553-554 
576 579 



15-21 2 
-58 60- 
-83 85 
116-124 
141 144 
165 167 
188-190 
206 210 
240-244 
258 261 
278 280 
299-301 
316 318 
344-345 
362 370 
390 392 
410-412 
442 444 
461-470 
490-491 
513 515 
534 536 
561-562 
581 583 



5 30-39 41-4 
66 68-69 72 
87 89 92-103 
126-127 130 
147-149 152 
-172 174 176 
193-194 196 
-214 219 221 
246-247 250 
265 268-269 
-281 284-286 
304 306-307 
320-321 326 
350 352-353 
374 376 378 
-393 400-401 
415 417 419 
-445 448 452 
472-479 481 
493 500-501 
520 522-524 
-540 542 547 
564 567-568 
585-597 599 



8 50- 
75 
105- 
133 
153 
178 
198- 
231 
-251 
272 
288 
309 
329- 
356- 
384 
403 
422- 
4 54 
483 
503- 
526- 
-549 
571- 
605 



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 665-670 672 674-675 676 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 7S0-751 759-766 768 772 7<74 - 
777 77S 783-788 793 796 798 800- 
805 808 010-812 014 810-819 821- 
824 826-632 834-837 843-847 849- 
867 865-876 878-883 887 e89-895 
897-89E 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 911 974- 
978 980-983 986 988-990 992-993 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1026- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1062 1085-1087 
1085-1090" 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1206- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-135C 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-2437 1439-1442 1445-1446 
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1526- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588 
1591-1592 1594-1595 1597-1596 
1600-1604 1611-1612 1614-1615 
1617-1618 2620-1622 1624-1625 
1627-1628 1630-1632 1634-1635 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786 



fetal liver- 
spleen 



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-133 
145 147-153 155 157 159 161-163 
156 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 
WO 01/53312



PCT/US00/34263



206 212-215 215-221 223 225-22S- 
231-232 240-244 246-247 250-253 
258-259 262 264 266-265 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 425-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOB 509-513 515-517 519-520 524 
526-531 534 S37-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 SIS- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-7C3 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-912 818 823-824 826-827 
832 834-837 839 843 846 848-856 
358-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 $01-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 956 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1005-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1146-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1206 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1276- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 140C 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 15C0-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 



fetal liver-
spleen



1597- 

leis- 

1641 

1661- 

1676- 

1691- 

1713- 

1727 

1744 

1763- 

1776 



1598 
1628 
1646- 
1662 
167S 
1692 
1714 
1730 
1748 
1764 
1779 



1600- 
1630- 
1649 
1664 
1633- 
1699 
1717 
•1733 
1752 
1767 
1783- 



1601 

1631 

1652 

1667- 

1684 

1702 

1719 

1738 

1758 

1769 

1786 



1611- 
1635- 
1654- 
1669 
1666- 
1707 
1722 
1740 
1760- 
1772- 



1612 

1638 

1655 

1674 

168f 

1711 

1726- 

1743- 

1761 

1773 



Columbia 
University 



flsoo:- 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 552 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



fetal liver 



Invitrogen 



flvoo: 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 20C- 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 £34 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 017 829 037 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
lOOe 1014-30X5 1020 1042-1043 
1070 1086-1087 1089-1090 3118- 
1115 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362- 
1363 1403 1405 1415 1419 34.25- 
1426 1429 1431 3442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-3550 1557-1562 3577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 
fetal liver 



Clontech



FLVOO > 



676 998 171$ 



Clontech



FLV004 



93 133 214 301 355 374 375 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



FHS003 



26 37-39 50-51 
113 128 131-132 
194 198 201 206 
261 276 282 286 
376 379 383 398 
436 438 452 462 
519 529 561 569 
607 623 626 635 
725-726 730 733 
826 837 860 874 
970 980 986 988 
1001 1007 1014 
1045 1060 1064 



58 84 
139 
211 
3 02 
412- 
-463 
-570 
647 
761 
913 
990 
1027 
1070 



86 89 98 
155 172 186 
230-231 256 
325 359 361 
413 419 430 
473 477 503 
590-591 597 
660 672 715 
775-777 786 
915 921 935 
992 1000- 
1035-1036 
1083 1097 



Invitrogen 



1099- 

1173 

1266 

1324- 

1383- 

1433 

1557- 

1632 

1712 

i76e 



1164 
1256 
•1320 
136? 
1409 
1554 
1620 
1675 
1754 



1102 
1196 
1270 
1325 
1364 
1505 
1559 
1644 
1725- 



1116- 

1208 

1277 

1329 

1399- 

1514 

1562 

1650 

1726 



1117 

1228 

1298 

1336- 

1400 

1542 

1589 

1652 

1743- 



1123 
1240 
1317 
1337 
1403 
1551 
15SS 
1671 
1744 



fetal muscle 



Invitrogen 



fmsoo: 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin



Invitrogen 



fskoo: 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 233 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 I97 r 198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-405 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 52G 531 537-540 547 549 56C- 
561 567 572-573 581 584 589 612- 
612 615 623 630-631 635 647 645 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-626 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 3018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1S59 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 



1626 1632 1634 

1644 1646 1654 

1665 1668 1675 

1702-1703 170* 

1724 1727 1731 

1742 1747 1749 

1765 1772 1776 
1786 



J636 1641 1643- 
1657 1660-1662 
1685 1687-1689 
1710 J716 1715 
1732 1737-1740 
1755 1760-1761 
1777 1779-1780 



fetal skin 



Invitrogen



FSK002 



fetal spleen 
umbilical cord



13 286 302 3C 
339 341 354 370 
408 414 426-427 
515 544 585 598 
1076 1109 1155 
1333-1335 1343 
137i 1377-1376 
1466 1647 1656 
1688 1693 1716 
1732 1739 1751; 



313 321 330 335 
372 385 400 402 
433 436 450 454 
767 810 845 939 
1317-1320 1326 
1347 13S0 1369- 
1391 1397 1422 
1678-1679 1687- 
1721 1725 1731- 



* FSP001 



110 137 211 353 589 927 1108 
1639 1771 



BioChain 



BioChain 



FUC001 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 62 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 152 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-38C 386 39C 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 S61 574-577 586- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 72S- 
727 732 749-750 762 765 771 775- 
777 780 789-751 793 796 802-803 
S14-B17 £22 833 843 845 848 858 
861 864 875 E79 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 94e 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 10B9 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1296 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1S7B-1579 1591 1595 1600- 
1601 1606 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1650 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



GIBCO



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 



72 75 77 80 82 65 90 
102 107 110 Ua-116 
123 126 128 134 136- 
153-155 157 161 165 
181 186 186-18S 197- 
208 210 215 222-223 
235-7.38 24 0-241 24 7 
260-262 267-269 276 
286 289 298 30C-302 
321-323 325 330-331 

349 352 354 356-359 
371-372 377 379-380 

350 400 408 414-416 
434-435 438 44:-443 
455 457-463 47C 472- 
478 482-483 466-488 
496 499-500 502-504 
S12 516 519-520 522 
S30 537-540 543-544 
567 569-570 572-582 
591 593 5S5 599 601 
611-612 614-620 622- 
636 643 645-647 650- 
661 665 667-66B 670- 
681 687 689 692-694 
714 717 721 727 729 
738 743-746 75C-751 
770 772 775-777 784 
799 802-805 810-811 
824 826 830 83C-837 
856 8S8-860 862 864 
877 879 883 886-887 
095 898-901 905 908 
919 922-923 925 927 
938 948 952-960 963 
972 975 978-979 981 
990 992 995 997 999 
1005 1011-1013 1016 
1023 1026 1029-1031 
1038 1041 1047 1050 
1059 1064 1068 1070 
1078-1079 1081-1082 
1094 1097 1103 1107- 
1115 1121-1122 1127 
1138 1140 1143 1148- 
1156-1157 1159 1167 
1193-1194 1200 1202 
1211 121G 1219-1220 
122S 1232-1234 1240- 
1246 1249-1251 1253- 
1267-1268 1271 1276 
1285-1289 1293-1294 
1308 1312 1316 1320 
1339 1341-1344 1346 
1357 1359 1365-1366 
1373-1375 1379 13B6 
1398 1409 1413-1414 
1420-1421 1425-1427 
1437 1439 1442 1445- 
1457 1459 1463-1464 
1474 1477-1479 1489 
1497-1498 15C1-1503 
1511-1513 1517 1520 
1526 1531-1533 1535 
1547 1554 1556-1559 
1571 1584 1587 1589 
1601 1611-1612 1614- 
1620 1625-1628 1630- 
1637-1638 1640-1643 



431 
4 53- 



05- 



-91 94 100- 
lie-119 122 
140 147-148 
16S-172 175 
196 204-206 
22S-226 230 
253 256-258 
279-281 284 
307 310 316 
339 341 346 
362 364-365 
382 384 387 
419 424 
449 451 
473 47S 477 
490-491 493 
506-507 
525-526 525 
546-547 566 
585 588 590 
604 606-609 
624 630 632 
652 654 655 
672 676 678 
697 699 710 
732 734 736 
759 763 766 
7B9 791 796 
814 819-821 
839-850 854 
869 871 876 
890-891 893 
910 912-916 
930-933 935 
964 967 969 
983 986-987 
1002 1005- 
1C18-1019 
1033-1035 
1053 1057 
1072-1072 
1086 1089 
1109 1113- 
1134-1135 
1151 1153 
1170 117S 
1207-1209 
1226-1227 
1241 1243 
1254 '1258 
1279 1282 
1305 1307- 
1327 1338- 
1349 1355- 
1369-1370 
1389 1394 
1416-1417 
1430 1433 
1452 1454- 
1468 1470 
1492 1494 
1507 1509 
1521 1524- 
1537-1536 
1564-1567 
1594 1599- 
1616 1619- 
1631 1634 
1645 1648- 



1645 1652 1653-1655 1657-1656 
1664-2665 1667 1665 1673 167B- 
1675 16£3-:i684 1686 1693 1701 
1704-17C5 1709 1713-1714 1717- 
1720 1724 1727-1728 1731-1733 
1737-1736 1743-1744 1752 1754- 
1755 17S7 3760-1761 1765 1772 
1779 178: 



HMP001 



110 204-205 503 634 678 855 
533 988-989 1379 1448 1504 



macrophage 



Invitrogen



5-8 
878 



infant brain 



Columbia
University 



IB2002 



10 15-12 15-18 22-23 25 29 34 
37-35 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 57 100 102-104 106-108 110 
112-122 115-116 118 123 12B 130 
134-236 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-384 186 153-196 198 201 
203-2C5 209-210 214-215 222 224- 
226 231-232 235-236 239 246-247 
252 257 260 268-269 272 276-277 
279-261 286 288 2S1-292 295 298 
30C-302 304 307 310 313 321-323 
330-331 333-334 335 346-347 349 
352 356-357 362 371-372 377 379- 
380 382-384 392 357 401 406 408 
411 413-424 416 418-419 422 428 
430-431 434-435 436 443 449 453- 
454 461 464-466 465-470 472-473 
475-476 478 482-483 487 490 4 92 
494 457 503 507-508 510-513 516 
519-520 S24-526 530-534 536-540 
547 550-551 561 563-564 566-567 
572-576 579 581-582 584-587 590- 
591 S£3 595-597 607-609 611-613 
616-627 620 622-624 627 631 637 
641 645-647 650-655 657-658 660- 
665 667-675 689 691 695 697 699 
703 707 713-715 717 721 728-731 
733-736 735 743 745 751 755 7S9 
763 769-770 772 778 780-781 785 
788-765 793-794 799 803 808 811 
814 825-626 e30 834-836 840-843 
845 848-650 854-855 860 862 e64- 
865 870 872 875-876 878 886 88o 
890-892 e94-896 898 903-904 916- 
917 515 522-525 927-928 930-932 
934-536 928 941 945-946 948-950 
953-554 559-962 966-969 977 979 
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1015 
1024-1025 1033 1036 1047 1052 - 
1052 1C54-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082 
1085 1C89 1108-1113 1118-1120 
1123-1324 1130 1132-1138 1140 
1149 1252 1153-1154 1163-1170 
1272 1274-1175 2183-1184 1186 
1190 1293-1194 1196-1197 1195 
1204 1206-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1265 1274 1279 1281 1283 
1285 1267-2289 1254-1255 1305 
1307 1313-2314 1316-1320 1325 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420 



infant brain 



1423 


1429- 


1431 


1435- 


-1436 


1435- 


1441 


1443 


1447- 


1449 


1451- 


1452 


1454- 


1455 


1457 


1459 


1463- 


1465 


1468 


1470- 


1471 


1475 


1479 


1482- 


1483 


1485 


1193- 


•1494 


1496 


1490- 


1499 


1502- 


1503 


1505- 


1507 


1509 


1522- 


152:- 


1525 


1528 


1531- 


1533 


1542 


1546- 


1547 


1549- 


-1550 


1554- 


1555 


1562 


1565- 


-1567 


1569 


1575 


1580 


1583- 


1586 


1588 


1590 


1592- 


1593 


1595 


1598 


1600- 


•1601 


1606- 


1610 


1612 


1614- 


-1616 


1619 


1621 


1624 


1626- 


1627 


1630- 


-1633 


1637 


1639- 


1640 


1642 


1644 


1647 


1652 


1654- 


1655 


1658- 


■1659 


1664- 


1.665 


1672- 


1673 


1676 


-1681 


1685- 


1688 


1693- 


1695 


1701-1702 


1704 


1708 


1717- 


1720 


1723- 


-1724 


1726- 


1728 


1733 


1735- 


1741 


1743 


-1744 


1752 


1755- 


1758 


1762 


1765 


1771 


1774 


1777- 


1778 


1786 









17-18 20-23 29 34 43 60 68-69 
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295 300-301 
310 322 324 331 334 339 346-347 
349-350 352 357 371 376-377 382 
384 403 408-409 414-415 453-455 
472 476 478-479 490 503 507 516 
520 530 534 536-540 551 563 572- 
576 585 587 590-591 593 595-596 
601 606 612 616-617 620 622-624 
650 652-653 661 66S 670-671 674- 
675 678 60S 725 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 1109 
1114-1115 1120 1123 1127 1144- 
1145 1149 1151-1153 1160 1167 
1170 1174 1193-1194 1196 1199 
1202 1206 1209 1220-1221 1226 
1229 1240-1241 1251 1258 1284 
1288-1289 1305 1314 1327 1333 
1344 1347 1350 1356-1357 1365- 
1366 1378-1379 1388 1400 1403 
1421 1423 1431 1436 1440-1441 
1446-1447 1457 1459 1471 1499 
1503 1507 1509 1535 1546 1557- 
1559 1567 1572 1587 1595 1596 
1610-1612 1615 1631 1639 1644 
1647 1657-1658 1673 1678-1681 
1683-1684 1701-1702 1708-1709 
1713-1714 1719 1757 1760-1761 
1765 1771 1778 



Columbia 
University 



IB2003 



infant brain 



Columbia 
University 



IBM002 



101 113 139 152 260 279 290-292 
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1205 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



infant brain 



Columbia 
University 



IBS001 



10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669 
















l 






€71 - 


672 


819 


949 


966 


j - ij 


113 








T, CI 


1188 1192-1194 


1196 


122 


; i 


i 

1 






1258 


1265 12 


71 1207 


1317 


-13 5 










1324 


-1325 1342 1423 


1440 


-144 


• 








144 8 


1471 1482 : 


525 


1532 


154 










1562 


1569 1588 1591 


1610 


16 j 










1647 


1649 1658 










lung, 


Stratagene 


I.FBOOa 


5-9 


17 20-21 25 


63-65 82 


94 


J 05 ! 


fibroblast 






153 


157 


197- 


198 


203 


207- 


208 


2 1 2 - i 








213 


223 


262 


26G 


233 


302 


321 


226 








333 


356 


370 


427 


430 


436 


446 


462 








472 


493 


498 


503 


516 


519 


527 


535 








537- 


540 


542-544 


562 


565 


567 


586 








5S9- 


600 


607 


615 


630 


647 


662- 


664 








692- 


694 


712 


719 


74 5 


74 8 


775- 


111 








754- 


796 


810 


837 


843- 


847 


849 


854- 








856 


869 


676 


903 


934 


953 


955- 


956 








964 


975- 


•576 


984 


1000 1005-10 


07 








1024 


-1025 1033 1039 


1053 


106 










1070 


1072 10B2 1112- 


•1113 


113 


t. 








1136 


-1138 1140 : 


1195 


1223 


123 










1233 


12' 


16 1279 1285 


1295 


131 










1320 


1334-1335 1343 


1427 


-142 


t- 








1446 


1478 1482 1493 


1504 


153 










1552 


1555 1567 1575 


1582 


15S 










1620 


1625 1632 1638 


1645 


165 


4 - 








1655 


1662 1680-1681 


1684 


168f 








1690 


1696 1702 1711 


1733 


174 




j 




1760 


-1761 1778 17B5 








lung tumor 


Invitrogen 


LGT002 


5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112-113 
115-116 118-119 123-124 126 130-132 
135-137 139-141 143-144 147-148 151-153 
155-156 159 161 164 169 171 179-180 185 
190 192 194 196-199 203-208 210 212-214 
216-217 21S 222 233 240-241 244 246 
251-252 255-256 261-262 266 272 276-277 
279-281 284 286 288 290 295 298 301-302 
309-312 317 321 329 332 341-342 344-345 
348 352 358-360 363 368 370-373 376 
380-381 384 389-390 398 400 409 414 423 
426-427 430 432-436 443-444 450-451 454 
462 468 472-477 480-483 487-488 490-491 
493 496-498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 547 554 
557 564 566-567 572-576 585-586 588-589 
595-596 601 607 611-612 615 619 621 623 
626 630 632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684 696 700 
706 710 713 716 718-719 722-723 728 
734-739 743 750 752 763 765-766 773-778 
784-785 787-789 791 800 802-803 809-812 
814 824 826 828-829 832 838-839 841-845 
849-850 852-855 857-861 864 866 874 
878-880 882 887 890-891 897-898 902 904 
906-907 910 916 918-920 922 924-925 927 
930-932 934-935 937 947 950 953 955-956 
961 963 966-967 969 971 977-979 981 984 
986-987 990 992-993 995 997 999-1001 
1005-1007 1009 1012-1013 1016 1020 
1022-1024 1026 1029-1030 1033 1038 1041 



1045 1047-1050 1052 1054-lOSi 1059 
1063-1064 1067-1071 1073-1074 1078 1085 
1087 ices 1095-1097 1104 1106-1107 1103 
1112 1116-1117 1119 1126 1134- :i3h 1132 
1141-1142 1144-1145 2I4S 1152-1153 
1156-1158 1167 1170 1172 1178 1195-1196 
iise-1200 1202 1204 1208 1214 1210 121? 
1222 1227 1234 1241 1247 1252 1257-1258 
1265 1267-1270 1276 1278 1280-1281 1283 
1285 1286-1289 1295 1300 1305 1308 1312 
1317-1321 1329 1338-1335 1341 1344-1346 
1349-1351 1353-1355 1357 1365-1366 1369 
1378-1379 1383-1385 1394 1397 1400 
1402-1403 1408 1417 1419 1423-1426 1431 
1433-1436 1438 1444 1446-1448 1454-1455 
1460 1466 14 66 1470 1474 1480-1481 1482 
1486-1488 1490-1491 1494-1496 1506 
1508-1509 1511-1512 1515-1516 1519 
1523-1524 1528-152? 1536-1540 1546 
1549-1550 1555 1560-1561 1565 1567 1569 
1575 1588 1591 1593-1594 1596-1596 
1600-1602 1608 1614-1616 161E 1620 
1624-1625 1627-1632 1636 1639 1644-1645 
1647-1649 1652-1653 Iujo 1662 1664 
1666-1667 1670-1671 1673-1675 1676-1679 
1683 1685-1688 1690-1692 1696-169S 1705 
1709 1716-1717 1722 1727 1730 1735 1739 
1741 1743-1744 1748-1749 1753 1760-1762 
176E 1767 1770-1771 1773 1775-1776 
1778-1779 1786 









Lymphocytes 



ATCC 



LPC001 



4 11-12 18 24-25 30-31 46 50-51 
56-57 68-69 80 92 98 103 105 110 
126 137 152-153 157 165 172 188- 
189 197 203 210 217-218 222-223 
225-226 229 231 247 251 256 264 
272 280-281 284 300-301 321 325- 
326 339 348 352 357 371 382 384 
390 400 404 412 414 421 423 426- 
427 430-431 445 447-448 451 454- 
455 475 503 516 526-527 530 537- 
540 549 556-560 563 574 577 589 
602 613 615-617 621 623 628-630 
636-637 647 649 657-655 690 697 
717 723 755 764 775-777 780 786 
789-790 793 800 802 822 838 849 
866 865 876 881-883 892 898 906- 
907 911 921-923 928 975 990 992 
996 1001 1004-1007 1033 1050 
1054 1078 1107 1135 1140-1141 
1143 1148 1158 1163 1177 119S 
1205 1216 1226 1231 1236 1242 
1244 1250 1258 1260 1265 1269- 
1271 1290-1293 1308 1312 1317 
1319-1320 1339 1345-1346 1348 
1350-1351 1357 1367 1369 1379 
1381 1383-1384 1386-1387 1389 
1394 1397 1405 1423 1425-1426 
1431 1437 1446 1448 1461 1466 
1470 1472 1474 1482 1492 1506 
1528 1537 1546 1549 1591 1596 
1600 1603-1604 1606 1627 1636 



1638 1647-1649 1652 1658-1659 

1664 1676-1677 1680-1681 1687- 

1660 1699 2711 1715-1716 1726 

1726 1737 1740 1746 1748 1752 

1756 1758 2777 1775 



leukocyte 



GIBCO 



LUC001 



3-4 10-11 13 15-18 20-21 24-25 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210 
212-215 217-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301 
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-452 453-454 456 459 461- 
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526 527 529-531 
534 536-540 542 547 549 553-559 
566-567 572 574-577 579 582 584- 
586 589 593 595-597 601-602 604 
606-607 613-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
736-739 743-746 749 751 753 756 
759 765-766 768 770-778 780 784- 
786 788-790 793 796 793 800 802- 
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058 1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1092 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 



leukocyte 



1453 1458 1459 1463 1464 1466 
1470-1471 1474 1477 1478 1482- 
1488 1490-1493 1496 1501 1504 
1506 1509 1512-1513 2526 1519 
1521-1522 1524-1525 1527-1526 
1531 1534 1538 1541 1545-1547 
1549-1550 1553 1555 1556 1560 
1565 1567 1575 1580 1589 1591 
1594 1596 1598 1600 1602 1606- 
1608 1621 1614 1620 1621 1624 
1626-1629 1631-1632 1636 1638- 
1639 1641 1644-1645 1648-1650 
1653-1655 1658-1660 1662 1669- 
1670 1675-1679 1684-1688 1690- 
1692 1696 1700 1702 1707-1709 
1711 1716-1717 1720 1723 1725- 
1727 1733 1737-1738 1741 1743- 
1744 1748-1749 1752 1755 1760- 
1762 1765 1769 1772 -1772 1781- 
1784 178t 



Clontech 



LUC003 



4 35-36 44-45 62 68- 
219 139 154 179 197 
324 372 404 430-431 
477 481 503 537-540 
581 589 608-609 621- 
632 647 662-664 669 
773 775-777 802 848 
879 905-907 915 949 
1002 1213 1119 1170 
2236-1237 2242 2275 
1357 1359 1377 1506 
1553 1592 2600 1613- 
1628 1670 1676-1677 
1699 1733 1738 2772 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



69 75 82 102 
244 280-281 
455 461 476- 
554 575-576 
622 624 630 
679 698 764 
851 856-857 
952 990 992 
1183 1216 
1346 1353 
1515 1534 
1614 1621 
1691-1692 



MEL004 



mammary gland 



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-336 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999 1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278*1290 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731 1732 1750 1760- 
1761 



Invitrogen 



MMG001 



5-8 10 12 14-18 20-21 24 25 29 
33-39 42-43 52 55-58 60 64 68-69 
71 73-74 79-80 82 89 98 100 103 
106 108 112 123 128 133 137 144- 
146 148 150-152 154 158-159 165- 
166 170-172 174 176 178 181-185 
188-190 194-196 201-206 210 217- 
222 224 227-228 231 233-237 247 
251 253-254 256 261-263 266-267 
271 276-277 279-281 284-286 288 



309-312 316 
325 331-332 
348 350 356 
371 376 379 
397-398 40$ 
430 

462-164 474 
488 490 494 
512 516-517 
534 $37-541 
572-574 587 
618 623 628 
647-648 650 
665 €67 669 
688 695-696 
720 722-730 
747-748 750 
780 784 
809 614 
854-658 
878 881 
911 916 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-113'? 
1148-1149 
1172-117? 
1196-1155 

-1218 1222- 
1240-124: 

-1259 1263 - 
1285-1286 

■1320 1323- 
1342-1345 
1359 1369- 
1383-1384 
1421-1423 
1431 1434- 
1454 1457 
1480-1485 
1505 1507 
1532 1534 
1549-1550 
1567 157; 
1587-1586 
1601-1602 
1616 1616 
1631 1635- 
1647 165C 

-1658 166C 

•1671 1673- . 

•1685 1689- 

-1715 1719- 
1732 1736 

■1747 1749 
1765-1768 
1779 1783- 



2S0 731 299 3C1 304 
320-32- 323-325 327 
334 339 341 344-345 
359-360 362-363 268 
303 380 390 393-395 
4C6 412 414-415 423 
441-444 448 451-455 
476 479 462 4e5-486 
455 498 503 506 509 
519-520 522 527 529 
•547 549 554 557 562 
589-591 557 602 607 
629 632 634-640 644 
652 655 657-658 660 
672 674-676 679 682 
706-707 710 713 717 
732-734 736 738 743 
755 759 761 766 770 
789 794 0C3 606-807 
822 827-829 637 842 
864 866 869-670 872 
893-900 904 906-907 
921-923 926 935-937 
953-954 957 960-961 
970 977-976 984-989 
100C-1001 1O05-1OC6 
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 3085 1087 
1095-1102 1107-1108 
1121-1123 1131-1133 
1139-1142 1144-1145 
1153 1159 1167 1170 
1183-1185 1190-1192 
1207-1208 1212 1216 
1223 1225 1231 1234 
1247 1253-1254 12S8 
1262 1270-1280 1283 
1298 1307 1314 1316 
1325 1330 1334-1335 
1349-3352 1354-1355 
1370 1377 1379 1381 
1389 1405 1414 1419 
1425-1426 1428-1429 
1437 1439 1448-1449 
1460-1464 1466 1471 
1487 1489-1491 1493 
1512 1519 1526-1528 
1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1582 
1592 1594 1596-1597 
1607-1608 1630 1612 
1621-1622 1625-1626 
1636 1641 1643-1644 
1652 1654-1655 1657 
1662 1664-1666 1669 
1674 1676-1677 1680 
1692 1701 1706 1713 
1720 1723-1728 1730 
1740 1742-1744 1746 
1751 1753 1760-1762 
1771 1774 1776-1777 
1784 178f 



706- 
817- 
863- 
889 
919 



induced neuron 
cells 



Stratagene 



NTD001 



29 35-36 80 116 123 156 163 181 
214 230 280-281 284-285 307 321 
330 340 358 371 375 377 380 382 
422 424 492 497 532-533 542 546 
549 566 586 595 612 645-647 654 
73< 775-778 780 792 799 821 826 
856 858 875 936 953 965 990 992 
1041-1043 1055 1072 1104 1193- 
1194 1206 1223 1246 1253 1274 
1286-1289 1291 1294 1311 1320 
1349 135S 1412 1423 1485 1620 
1623 1645 1684 1705 1715 1751 



retinoic acid-induced 
neuronal cells 



Stratagene 



neuronal cells 



NTR001 



5-fc 78 268-269 277 383 431 506 
623 677 731 999-1000 1199 1425- 
1426 154"/ 



Stratagene 



NTU001 



29 65-66 80 82 110 119 146 152 
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
39: 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537- 
540 572-574 597 602 607 623 647 
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 106$ 1122 1148 1185 
1219 1226 1234 1246 1271 1283 
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1535 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



pituitary 
gland 



Clontech 



PIT004 



311 314 379 408 415 430 454 1055 
1095-1096 1272-1273 1312 1320 
1376 1652 1671 1720 1725 1736 
1741 1755 



placenta 



Clontech 



PLA003 



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1359 1609 1623 
1737 



prostate 



Clontech 



PRT001 



9 46 57 71 107 147 171 177 197 
201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 674 690-891 905 938 945 963- 
964 938-989 1002 1025 1033 1045 
3061 1095-1096 1112 1125 1142 
119€ 1198 1202 1232-1233 1241 
1256 1272-1273 1287 1255 1313 
1333 1341 1344 1349 1360 1362- 
1363 1367 1437 1442 1447 1475 
1478-1479 1482 1469 1513 1517 
1527 1531 1536 1598-1599 1628 
1636 1657 1680-1681 1687-1686 
1717 173B 1743-1744 



rectum 



Invitrogen 



REC001 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 S89 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 738 748 750 
756 762-763 766 770 774 790 819 
825 643 849 851 881 903 909 946- 
949 S60 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350- 
1351 1355 1369 1373 1375 1425- 




:<426 1436 1439 14S9 1474 1471 






246;. 1546 1587-1588 1592 1596 






HjO 1622 1627 1644 1658 166^ | 


1 




j865-1666 1669 1675-1677 174.<- i 

■ 76 C 


salivary gland 


Clontech 


SAL001 


3C 55 97 103 110 140 149 152 158 I 






2S( 217-218 242-243 256 301 308 


! 




321 333 3si 354 360 410 427 


i 




448 473 487 494 496 501 535 555 






SfS-570 £72-573 590-551 624 636 






r.L : : 759 762 764 768 771 788 800 


i 




60S- 826 848 865 879 506-907 925 


! 




521- 963 1016 1020 1025 1040 1C46 


! 




3C55 1066 1103 1150 1172 1183 






3^34 1281-1282 1288-1269 1298 


1 




3315 1320 1333 1336-1337 1348 








3259 1373 1379 3424 1447 144? 








3474 1482 1492 1494 1498 1513 








3523-1524 1537 1554 1596 1626- 






3627 1636 1652-1655 1658 166S 


1 




3671-16/2 1691-1692 


salivary gland 


Clontech 


SALS 03 


158 326 1423 1463-1464 


skin 


ATCC 


SFB001 


2 1-20 1400 


fibroblast 






skin 


ATCC 


SFB002 


262 736 1025 1253 


fibroblast 








skin 


ATCC 


SFB003 


705 1119 1350 1631 1653 


fibroblast 








small 


Clontech 


SIN001 


2b 142 146-147 151 155 198 202 


intestine 






244 260 271 280-281 286 288 298 








3C-J-302 308 312 334 340 371 398 






408 412 414 416 423 425-427 430 






434-435 445 452 454 478 503 516 






S3 9 521 523 543 547 549 555 559 






S63 £69-570 585 592 604 611 626 


I 




£26-629 632 650 659 681 710 714 








716 750 764 780 798 829 842 857 








859 866 887 892 894-895 903 904 








9C6-907 912 919 935 997-998 1000 








3007-1008 1026-1028 1044 1055 








1089 1097 1116-1117 1131 1148 








1169 1199 1219 1234 1247 1264 








2279 1316 1320 1326 1341 1342 








1349 1351 1374 1387 1398 1400 








3403 1407 1423 1428 1468 1498 








2 501 1521 1550 1556 1585 1597 








1636 1638-1639 1645 1653 1656 








3hfc2 1671 1675 1684 1691-1692 








S704 1711 1717 1719 1722 1725 








3726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 


skeletal 


Clontech 


SKM001 


38 20-21 82 84 101 118 134 148" 


muscle 






35: 153 166 225-226 256 274 277 


! 




289 329 361 412 414 424 440 452 








4^9 470 488 503-504 537-540 647 








660 673-675 715 773 780 786 83C 








SCE 522 950 963 982 99C 952 1020 








3047 1063 1115-1117 1121 1134 








1226 1268 1284 1298 1321 1325 






1226-1337 1343 1409 1413-1414 


1 




3509 1599 1624 1644 1653 1712 


skeletal 


Clontech 


SKM002 


368 1683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKMS04 


225-236 


muscle 








spinal cord 


Clontech 


SPC001 


4 S 11 17 30-31 35-36 43 46 60 



£2 BS 92 94 aoe HO 
167 198 204-205 210 
25$ 277 280-283 300- 
217 372 379 387 392 
43C 433 448 467 473 
513 519 524 526 
549 551 559 567 
516-617 623 



adult spleen 



505 
547 
60" 



625 



f£2 657-658 670-671 
682 709 711 715 719 
743-750 753 775-777 
805 820 832 834-836 
655 858 861 864 871- 
89R 906-908 917 919 
94'. 970 985 990 992- 
1039 1053 1059 1065 
1077 1082 1085 1097 
UlG-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350 
13S6 1359 1368 1375 
1407 1423 1429 1437 
14B4 1470 1482 1492 
1511 1529 1538 1548 
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



136 139 157 
215 229 25C 
302 304 315 
419 426-427 
487 489 506 
537-540 543 
569-570 5S3 
637 649-65G 
673 679 6£i- 
726-729 734 
781 789 751 
847-849 854- 
872 875 884 
924 934 942 
993 998 1013 
1072 107b 
1103 1105 
1151 1170 
1225 1241 
1312 1320 
1353-1354 
1400 1406- 
1443 144fe 
1501 1508 
•1549 1565 
1614 1625 
1651-1652 
1751 175^ 



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



Clontech 



SPLc01 



stomach 



Clontech 



STO003 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 2B0- 
281 287 291-292 302 312 358 362 
42^-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 735 
760 782 785 846 919 960 964 966- 
967 S76 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-123' 
125S 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1476 
1467 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



Clontech 



thalamus 



THA002 



9 1 
ISO 
239 
333 
388 
477 
60B 
774 
899 
103 
115 
119 
130 
144 
156 
161 
168 
175 



1 25 85 87 112 137 146 180 
198 206 210 212-213 235-236 
261 268-269 279 290 301 32S 
-334 341 351 356 364-365 379 
393 396 419-420 441-442 456 
483 508 525 531 549 567 606 
-609 647 681 715 725-727 736 
762 784 794 827 883 890-891 
-500 961 997 999-1001 1004 
4 1055 1097 1129 1144-1141 
0-1151 1157 1172-1173 1177 
1194 1208 1220 1249 1280 
1355 1369 1434- 



5 1345 1355 1369 1434-143E 

0-1441 1454 1496 1546 1549 

2 1572 1578 1590 1594 1613- 

4 1640 1651-1652 1671 1687- 

8 1703 1743-1744 1746-1747 



44-45 54 57-58 62-64 79 104 123 

32£ 134 153 193 212-213 218 242- 

243 258 274 277 279 257 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487 

493 503 506 509 517 526 535 537- 



thymus 



Clontech 



THM001 



540 546 546 554 567 584 586 590- 
591 604 612 621 636-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 856-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 144S 1477 1486 
1493 1536 1554 1620 1644 1646 
1649 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1686 
1707 1711 1731-1732 1737 



thymus 



Clontech 



THMc02 



5-9 15-21 25 33 35-36 43-45 48 
50-51 54-55 60 75 &3 87 89 93 
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 286 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 36C 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461 
464-467 470 472 474-476 483 4 88 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 63C-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 7C3 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 B70-671 882 
890-B93 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
100B 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1142 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1283 1284 129C 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549 
1566 1554 1598-1600 1606 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1672 1672 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786 



thyroid gland 



Clontech 



THR001 



4 9-10 20-21 37-39 48 50-51 54- 
57 60-63 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155-158 163-164 1G6-169 171 
166 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-263 277 280-281 
284-286 288-285 298-299 302 305- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
467-488 490-454 496-497 500-501 
503-504 506 509-513 516-S17 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594- 
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 G9B 700 709 721 
727-729 732 734 73fi 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823- 
824 826 828 833 838 B41-845 847 
849 857-860 867 874-875 878 B8C- 
881 887-868 890-892 894-895 898 
908 910-911 913-514 922-923 926- 
927 929 932-934 537 939 941-942 
948 953 957 961 963-964 966 978- 
979 981-982 937 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192 
1197 1200 1204 1208-1209 1214 
1217 1215 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1285 1295 3306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1354 1407 1419 1428*1436- 
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1475 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555- 
1559 1562 1567 1578 1590-1591 
1597 1559-3601 1612 1614 1616 
1619-1620 1622 1524-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1676-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-173e 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



trachea 



Clontech 



TRC001 



9 29-31 46 48 87 104 107 110 135 
158 222 262 266 286 301 318 331 



352 372 377 
454 472 474 
£93 597 607 
810 859 666 
S22 932 935 
1102 1113 
1237 1221 
1414 1424 
.1569 1579 
1667 1671 
1692 1711 



384 414 424 445-446 
491 496 560 579 588 
612 626 6ei 702 719 
B78 894-895 912 916 
1046 1075 1080 1099- 
208 1215 1232-1233 
312 1385 1387 1405 
430 1437 1447 1505 
586 1600 1641 1653 
676-1677 1683 1691- 
717 1726 1772 



uterus 



Clontech 



UTR001 



17 19 25 41 
108 139 252 
263-265 274 
446 446 452 
506 513 519 
560 601 610 
773 780 833 
929 934 937 
1050 1075 
1258 1279 
1343-1344 
1478 1481 
1552 1579 
1626-1627 
1719 1722 



46 57-58 
174 198 
290 387 
473 491 
522 526 
632 659 
845 857 
996 1009 
107 1124 
287 1310 
375 1437 
498 1519 
597 1602 
649 1652 



61 85 104 
200-201 206 
408 420 436 
493 499 503 
530 542-543 
665 720 751 
872 877 912 

1011 1018 
117C 1219 
1320 1323 
1451-1452 
1521 1536 
1606 1620 
1661 1670 



723 



TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 
IDENTITY 




Y4 1736 


sap^en c 


o u -lien* riw/iiii 

seouence . 


13S6 


10C 


— : — 


Y66656 


Homo 
sapiens 


Membrane-bound protein 
PRO943. 


2389 


99 ~~ . 

j 




AF113136 


Homo sapiens 


IL-1 receptor-associated 
kinase-M; IRAK-M 


3043 


100 


4 


AF01780* 


Mus musculus 


Zn-15 transcription factor 


6351 


77 




X02761 


Homo sapiens 


fibronectin precursor 


1053b 


98 




X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 1 


c 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


c 


AJ01167? 


Homo sapiens 


Rab6 GTPase activating 


5251 


99 


10 


VJ88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein. 


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


90 






Homo sapiens 


QuHbttii. . h novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 


Dab 


i"nn 

1UU 


13 


Y50620 


Homo sapiens 


Protein regulating gene 


1894 


98 


14 




^ 

sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


1 00 


1 C 


Ht /. J J H 3 J 


Homo sapiens 


DROl* 1-SL-Ck nynrai n DDVPHD1 

KrAV.ri- 1 1 xe procei n fkav^dJt j 


3124 




1', 


AF2013O3 


Homo sapiens 


dhtr oribeta- binding protein 


3130 


98 


It 


AF064205 


Home sapiens 


dynactin l pl50 isoiorm 


6377 


100 


1 F 




Saccharomyce 
6 cerevisiae 


Ynri21wp 


174 


2 6 


2C 


ABU3 290 J 


Homo sapiens 


guanos ine monophosphate. 
reductase isoloa 


1 801 


99 


2 3 


ABU3 2303 


Home sapiens 


guanos ine tr.onophosphatc- 
reductase isolog 


14 85 


99 


21 


Ar 14050 / 


Homo sapiens 


Ca2i /calmodul in- dependent 
protein kinase kinase beta 


3 083 


O Ci 


23 


Al'i 4 JbU / 


Homo sapiens 


Ca2+/calmodul in-dependenL 
protein kinase kinase beta 


<co VV 




2 £ 




I1U1IIU £>C1 L/ J. Lllb 


hriTi/^r/*! ^ 1" i n tL — O. 

cj JUiicix 1. in h—\j- 

cnl ■f#-»f ra ti es f o v A «se> 

5UJ.1UH OJIQICX (JSC 


2213 


99 


2E 


U33460 


Home 
sapienp 


DNA- directed RNA polymerase 
I, largest subunit 


8777 


98 


-2 ( 


Y44 4 88 




■Mv^KJrJUivz VtAZJ-aiiL. ijxvJLej.il. 


13 87 


100 


2"/ 


U43 701 


Komo sapiens 


ribcsomal protein L22a 


791 


100 


2 6 


U02032 




- locsoiiitdj^ protein ju^._:e 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by Qene 17 clone 
HNFIY77. 


1083 


99 


30 


W71749 


nuiiLu jcjJi cub 


Human 1 1 V\ -i rn ii t i ri rnninnat 1 rtr> 

system protein 2 . 


715 


90 


3 j 


W7 l 149 


U nmn c Si r> i Arte 


/l U titer J J UJUX ytij. L J iJ i^L^iJ J LlUJi 

system protein 2 


631 


82 


32 


AF231917 


Komo sapiens 


long-chain 2-hydroxy acid 
oxidase HA0X2 


1811 


100 


32 


Z29481 


Home sapiens 


3-hydroxyanthranilic acid 
di oxygenase 


1507 


99 


34 


AB0C1451 


Homo sapiens 


Sck 


2865 


100 


35 


Y00644 


Homo sapiens 


to 287) 


1667 


99 


3€ 


Y00644 


Home sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


9B 


37 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


3586 


78 


36 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


4726 


99 
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TABLE 2 



SEQ 1 ACCESSION j SPECIE- 1 
ID ! NUMBER j 
NO: | ' 


DESCRIPHON 


SMITH- > V — ! 
WATERMAN | IDENTITY 
SCORE l ' 


39 | Y787S5 | Homo sapiens | Human antizyme-2 (AZ-2) amino acid sequence. | 3556 | 77
40 | U93121 | Homo sapiens | M-phase phosphoprotein- J | 3747 | 100
41 | Y427SO | Homo sapiens | Human calcium binding protein 1 (CaBP-1). | 795 | 100
42 | AF282626 | Homo sapiens | latexin | 1185 | 100
43 | G0215C | Homo sapiens | Human secreted protein, SEQ ID NO: 6231. | 384 | 94
44 | U19617 | Mus musculus | Elf-2 | 2724 | 88
45 | U19627 | Mus musculus | Elf-2 | 2062 | 86
46 | AF100756 | Homo sapiens | osteoinductive factor OIF | 1538 | 100
47 | Y87592 | Homo sapiens | Human SPROUTY-2 protein, SEQ ID NO:24. | 1737 | 99
49 | X04145 | Homo sapiens | T3 gamma precursor (aa -22 to 160) | 942 | 99
51 | X63547 | Homo sapiens | oncogene | 5845 | 99
52 | M94043 | Rattus norvegicus | rab-related GTP-binding protein | 1089 | 96
53 | L317B3 | Mus musculus | uridine kinase | 917 | 72
54 | X83973 | Homo sapiens | transcription factor | 4486 | 98
55 | AF224742 | Homo sapiens | chloride channel protein 7 | 4128 | 99
56 | W7480S | Homo sapiens | Human secreted protein encoded by gene 77 clone HOEAS24. | 1491 | 100
57 | ZS09D7 | Homo sapiens | Human 7BC-1 cDNA from second transcript, | 4824 | 100
58 | D79994 | Homo sapiens | similar to ankyrin of Chromatium vinosum. | 6089 | 99
59 | D79994 | Homo sapiens | similar to ankyrin of Chromatium vinosum. | 4014 | 91
60 | Y59738 | Homo sapiens | Human normal ovarian tissue derived protein 15. | 601 | 100
61 | AB032069 | Homo sapiens | protein containing CXXC domain 1 | 1390 | 100
62 | Y66660 | Homo sapiens | Membrane-bound protein PRO783. | 2492 | 99
63 | Y66660 | Homo sapiens | Membrane-bound protein PRO783. | 1709 | 99
64 | S70012 | Rattus sp. | tricarboxylate carrier | 895 | 55
65 | AF139516 | Rattus norvegicus | A-kinase anchor protein | 178 | 24
66 | W29666 | Homo sapiens | Homo sapiens DH1306_1 clone secreted protein. | 157 | 30
67 | AJ245738 | Homo sapiens | claudin-15 | 1206 | 100
68 | AF099136 | Rattus norvegicus | GLUT4 vesicle protein | 4183 | 87
69 | AF099138 | Rattus norvegicus | GLUT4 vesicle protein | 4906 | 86
70 | 282055 | Caenorhabditis elegans | Similarity to Drosophila ring canal protein comes from this gene | 1285 | 44
71 | AF224278 | Homo sapiens | PMEPA1 protein | 1282 | 100
72 | AF126426 | Homo sapiens | neurotrimin | 1809 | 100
73 | Y41652 | Homo sapiens | Human MEK2 protein sequence. | 2065 | 99
74 | Y41652 | Homo sapiens | Human MEK2 protein sequence. | 1207 | 100
75 | AF188622 | Mus musculus | ^ _ ~ 5 : selectively expressed in embryonic epithelia protein-1 | 1485 | 74
76 | AE000406 | Escherichia coli | putative DNA topoisomerase | 9S0 | 100
77 | X99302 | Homo sapiens | Pop1 | 655 | 100
78 | AL136538 | Schizosaccharomyces pombe | similarity to S. cerevisiae kti12 protein | 210 | 31
79 | AF129756 | Homo sapiens | G4 | 1554 | 99

80 | Ab096768 | Homo sapiens | dJ658E16.2 (phosphatidyl serine decarboxylase (FSSC, EC 4.1.1.65)) | 2033 | 100
81 | AL096766 | Homo sapiens | dJ858El€.2 (phosphatidyl serine decarboxylase (PSSC, EC 4.1.1.65)) | 1220 | 96
82 | XS7351 | Homo sapiens | 1-8D | 677 | 9 fc
83 | AC005594 | Homo sapiens | R26984_1 | 2700 | 98
84 | X73113 | Homo sapiens | fast MyBP-C | 5959 | 99
85 | AF09733C | Homo sapiens | H1 chloride channel; p64H1; CLIC4 | 1305 | 99
86 | AB01B423 | Mus musculus | SH2 domain-containing protein | 1360 | 78
87 | AF272151 | Homo sapiens | adaptor protein CIKS | 3064 | 59
88 | AF196325 | Homo sapiens | triggering receptor expressed on monocytes 1 | 1214 | 100
89 | AB01687S | Arabidopsis thaliana | contains similarity to pre-mRNA splicing factor~gene_id:MRE17.2. | 634 | 36
90 | AJ133721 | Mus musculus | homeodomain protein | 654 | 57
91 | AJ242864 | Mus musculus | phtf protein | 619 | 6:
92 | A61971 | unidentified | MCSF | 11676 | 95
93 | Y59365 | Homo sapiens | Human PRO1250 (UNQ633) amino acid sequence SEQ ID NO: 86. | 3890 | 100
94 | Y87231 | Homo sapiens | Human signal peptide containing protein HSPP-8 SEQ ID NO: 8. | 1031 | 100
95 | AF227741 | Rattus norvegicus | protein kinase WNK2 | 2426 | 95
96 | AF227741 | Rattus norvegicus | protein kinase WNKj | 1900 | 94
97 | Y92513 | Homo sapiens | Human OXRE-10. | 1626 | 100
98 | AL021366 | Homo sapiens | CICK0721Q.3 (Kinesin related protein) | 3423 | 100
99 | AC00S733 | Homo sapiens | R33083_. | 1974 | 99
100 | Y95293 | Homo sapiens | Human GEF containing NEK-like kinase substrate sGNK. | 4092 | 99
101 | AL118501 | Homo sapiens | dJ1191N16.1 (A novel protein (translation of the cDNA DKFZp566A0946, Em:AL050069)) | 1509 | 100
102 | AJG06267 | Homo sapiens | ClpX-like protein | 3233 | 100
103 | AF100753 | Homo sapiens | ancient ubiquitous 46 kDa protein AUP1 | 2042 | 96
104 | AB015982 | Homo sapiens | serine/threonine kinase. | 4711 | 100
105 | AF151074 | Homo sapiens | HSPC240 | 83 j | 64
106 | M35522 | Canis familiaris | GTP-binding protein (rab7) | 354 | 50
107 | R99600 | Homo sapiens | NTII-1 nerve protein, facilitates regeneration of nerve cells. | 2337 | 93
108 | AF125533 | Homo sapiens | NADH-cytochrome b5 reductase isoform | 1290 | 93
109 | AC005614 | Homo sapiens | F23269_^ | 3369 | 99
110 | AF064729 | Homo sapiens | RAN binding protein 16 | 3285 | 100
111 | X52425 | Homo sapiens | interleukin 4 receptor | 4496 | 100
112 | Y41686 | Homo sapiens | Human PRO274 protein sequence. | 228r | 100
113 | W15506 | Homo sapiens | Mitogen activating protein kinase ERK1. | 199; | 100
114 | Y71071 | Homo sapiens | Human membrane transport protein, wiKr" 10. | 1190 | 99
115 | AL049548 | Homo sapiens | dJ398G3.3 (ortholog of rat CPG2) | 3497 | 99
116 | AF189817 | Mus musculus | evectin-2 | 1124 | 90
117 | W30891 | Homo sapiens | Human cytostatin III protein. | 715 | 99






118 | AF:i661G | Homo sapiens | PRO103fc | 1469 | 100
119 | Y08915 | Homo sapiens | alpha 4 protein | 1746 | 100
120 | AF098070 | Drosophila melanogaster | Lis1 homolog | 192 | 39
121 | AF0S2422 | Homo sapiens | katanin p80 subunit | 18: | 37
122 | Y70743 | Homo sapiens | PSEQ-i protein encoded by NSEQ gene associated with matrix remodelling. | 2637 | 98
123 | AF063246 | Homo sapiens | HSPC026 | 2132 | 100
124 | Y270S6 | Homo sapiens | Human viral receptor protein (ACVRP). | 833 | 99
125 | MS 31 05 | Leishmania major | glycoprotein 96-92 | X 1 £ | 27
126 | U7S467 | Drosophila melanogaster | Atu | 935 | 36
127 | Z6B220 | Caenorhabditis elegans | Similarity to Human ADP/ATP carrier protein | 438 | 43
128 | AF09S927 | Rattus norvegicus | protein phosphatase 2C | 1927 | 94
129 | W92ySf | Homo sapiens | Human 2sin<,4 protein. | 462 | 100
130 | AF12539; | Lactobacillus sakei | ribokinase RbsK | soe | 37
131 | X9349E | Homo sapiens | 21-Glutamic Acid-Rich Protein | 1250 | 100
132 | X93498 | Homo sapiens | -1 C j. U l,CHHl V* r\ J U A 1 V^I I r i ULwlIJ | 916 | 87
133 | W52B13 | Homo sapiens | H vim An DR7 /ZiPRP - 1 S W p> nrnhAin * j Ull to 1 i / ^AL>Dr 4- *1 A. \3T £j JL \J c ^ ii (DBIH). | 70b | 97
134 | Y84444 | Homo sapiens | Amino acid sequence of a human RNA-associated protein. | 3230 | 100
135 | N€91Sj | Homo sapiens | non-muscle myosin P | 189 | 20
136 | W74662 | Homo sapiens | Human secreted protein encoded by gene 154 clone * in. v r ±jg o * | 480 | 100
137 | W7820C | Homo sapiens | Human secreted protein HHGAU81. | 655 | 99
138 | AL033520 | Homo sapiens | dJ349A12.1 (similar to KIAA0701 protein) | 424 | 39
139 | AF020261 | Santalum album | proline rich protein | 119 | 30
140 | X70294 | Homo sapiens | zinc finger protein | 1634 | 100
141 | Y06439 |  | Human protease HUPM-8. | 936 | 100
142 | Z6B493 | Caenorhabditis elegans | predicted using Genefinder | 365 | 42
143 | AB018107 | Arabidopsis thaliana | ADP-ribosylation factor-like protein | 596 | 65
144 | AF161483 | Homo sapiens | HSPC134 | 580 | 51
145 | Y84902 | Homo sapiens | A human proliferation and apoptosis related protein. | 480 | 100
146 | AB004906 | Ipomoea purpurea | transposase | 146 | 20
147 | AC007357 | Arabidopsis thaliana | F3F19.18 | 647 | 31
148 | W752S5 | Homo sapiens | Human secreted protein encoded by gene 41 clone HNTME13. | 1494 | 98
149 | AF056490 | Homo sapiens | cAMP-specific phosphodiesterase 8A | 3710 | 99
150 | Y58171 | Homo sapiens | Human hydrolase homologue KKH-7. | 785 | 99
151 | U10397 | Saccharomyces cerevisiae | Yhr146wp | 515 | 53
152 | X73478 | Homo sapiens | phosphotyrosyl phosphatase activator | 1719 | 99
153 | AL049697 | Homo sapiens | du3821i0.5.i (novel protein similar to arginyl-tRNA) | 2034 | 99





154 | AF169602 | Homo sapiens | cytochrome b5 reductase b5R.2 | 145: | 99
155 | X947CI- | Homo sapiens | rab2F | 322c | 59
156 | Y25716 | Homo sapiens | Human pprrprpri nrnt'fMn encoded from gene 6 | 247; | 100
15f | W77404 | Homo sapiens | Secreted salivary polypeptide zsig32. | 937 | 100
1 cc i-Oi? | Y17246 | Homo sapiens | Human protein kinase inhibitor-2 (PKI-2). | 363 | 100
 | J0497C | Homo sapiens | carboxypeptidase M precursor | 239: | 100
1S3 | W5404C | Homo sapiens | oi'nt'pi n PTF1 | 48': | 98
162 | AL022724 | Homo sapiens | dJ4l3H6.1.1 (hamster Protein LIKE PUTATIVE. r>rot pin! (i^oform 1} | 135; | 100
163 | AF125535 | Homo sapiens | pp21 homolog | 193 | 45
164 | G03632 | Homo sapiens | Human secreted protein, SEQ ID NO: 7713. |  | f /
165 | AJ25C839 | Homo sapiens | serine/threonine protein kinase | 1441 | 71
166 | L09649 | Zymomonas mobilis | zm^ | 17"' | 37
167 | Y73337 | Homo sapiens | HTRM clone 1944530 protein sequence. | 1204 | 100
168 | W8B646 | Homo sapiens | Secreted protein encoded by gene uz cione nu^r t_ / j. . | 1084 | 100
169 | AF214733 | Homo sapiens | ATP-dependent RNA helicase | 4402 | 100
170 | AE000e71 | Methanobacterium thermoautotrophicum | conserved protein | 166 | 27
171 | Y276B4 | Homo sapiens | Human secreted protein encoded by gene No. 118. | 821 | 100
172 | AF226044 | Homo sapiens |  | 2904 | 100
173 | AC245946 | Homo sapiens | neuroglobin | 775 | 100
174 | P4394S | Homo sapiens | inis cene j.5 novej. • | 320: | 100
175 | Y0792? | Homo sapiens | GTP-binding protein | 1205 | 100
176 | ill n n 1 1 c WSU33 fc | Homo sapiens | Human DPI homologue protein. | QC.C. | zfot
177 | Y4167E | Homo sapiens | Human channel-related mr\l prtil b VJfDM- 1 wlOJ- trCTUl e nLlul J . | J. JL^w | 100
178 | Y41674 | Homo sapiens | Human channel-related molecule HCSM-2 | 93£ | 99
179 | AF2204S2 | Homo sapiens | lc>"ii*>rin**l - "i i Wj* vine* f inQPir protein HZF2 | 4100 | 99
180 | X03064 | Homo sapiens | C1q B-chain precursor | 1240 | 100
181 | U57344 | Mus musculus | Meis3 | 1813 | 89
183 | U57344 | Mus musculus | Meis3 | 1743 | 86
184 | U57344 | Mus musculus | Meis3 | 1070 | 86
185 | AF033120 | Homo sapiens | rici rpnulfltprl PA76-T2 nit r* 1 psr protein | 138<r | 58
186 | AF2C0357 | Mus musculus | pantothenate kinase 1 beta | 160i | 82
187 | W7505P | Homo sapiens | Human secreted protein encoded by gene 2 clone HLDBG33. | 1186 | 99
188 | AJ2S2S29 | Homo sapiens | suppressor of sterile four 3 | 2424 | 100
190 | X54134 | Homo sapiens |  | 370^ | 100
191 | Y22203 | Homo sapiens | Human calcium-binding phosphoprotein, CSPP-1, protein sequence. | 1082 | 99
192 | W63692 | Homo sapiens | Human secreted protein 12. | 1975 | 100
193 | W8777I- | Homo sapiens | Human serum glucocorticoid-regulated kinase (K-SGK2) polypeptide. | 2608 | 33

194 | AF084259 | Mus musculus | bromodomain-containing protein EP75 | 693 | 54
195 | Y00752 | Rattus norvegicus | serine dehydratase (AA 3 327) | 994 | 61
196 | til Cl C ' 1 A V. W 9 D 3 4 | Homo sapiens | Human foetal brain secreted protein fhl70 7. | 259t | 100
197 | ABU2oob3 | Homo sapiens | hD j 9 | 1890 | 100
198 | W95633 | Homo sapiens | Homo sapiens secreted protein gene clone hm236_1. | 1614 | 100
199 | Y44277 | Homo sapiens | Human nucleic acid methylase- | 209(- | 99
200 | AB030039 | Homo sapiens | hFACPLi | 225( | 100
201 | X54162 | Homo sapiens | 64 Kd autoantigen | 291f | 99
202 | G02061 | Homo sapiens | Human secreted protein, SEQ ID NO: 6142. | 5S£ | 99
203 | X1388S | Nicotiana tabacum | extensin (AA 1-620) | 185 | 33
204 | 004204 | Bos taurus | 32 kd accessory protein | 1837 | 100
205 | J04204 | Bos taurus | 32 kd accessory protein | 1101 | 100
207 | Y872e3 | Homo sapiens | Human signal peptide containing protein HSPP-60 SEQ ID NO: 60. | 131fc | 100
206 | Y02B6C | Homo sapiens | Fragment of human secreted protein encoded by gene 65. | 936 | 98
209 | am 21 se 9 | Homo sapiens | dJ1076E17.1 (KIAA0823 protein (continues in AL023803)) | 694 | 54
210 | AF226732 | Homo sapiens | NPD007 | 1345 | 76
211 | X66295 | Mus musculus | C1q C chain | 970 | 73
212 | Z29328 | Homo sapiens | Ubiquitin-conjugating enzyme UbcH2 | 966 | 100
213 | Z2S326 | Homo sapiens | Ubiquitin-conjugating enzyme UbcH: | 542 | 98
214 | AJ002030 | Homo sapiens | progesterone binding protein | 1163 | 100
215 | X70649 | Homo sapiens | member of DEAD box protein family | 3933 | 100
216 | AF25DS58 | Homo sapiens | claudin-2 | 1169 | 99
217 | 70,021453 | Homo sapiens | dJ821D11.1 (PUTATIVE protein) | 259 | 100
218 | Y08565 | Homo sapiens | UDP-GalNAc polypeptide N-acetylgalactosaminyltransferase | 3331 | 99
219 | Y94452 | Homo sapiens | Human inflammation associated protein | 2067 | 100
220 | AL035521 | Arabidopsis thaliana | putative protein | 315 | 42
221 | AL031786 | Schizosaccharomyces pombe | putative proline-tRNA synthetase | 611 | 41
222 | AL109736 | Schizosaccharomyces pombe | WD repeat protein | 626 | 40
223 | X52493 | Glycine max | DNA-directed RNA polymerase | 136 | 23
224 | AL0356S9 | Homo sapiens | dJ979N1.1 (dJ979N1.1) | 5199 | 98
225 | AB032401 | Mus musculus | mmDj4 | 1763 | 92
226 | AB032401 | Mus musculus | mmDj4 | 1986 | 92
227 | X83502 | Saccharomyces cerevisiae | J1007 | 112 | 26
228 | X83S02 | Saccharomyces cerevisiae | J1007 | 79 | 25
229 | AF143723 | Homo sapiens | heat shock protein HSP60 | 2557 | 99
230 | Y66677 | Homo sapiens | Membrane-bound protein P30826. | 982 | 100
231 | AB027466 | Homo sapiens | spondin 2 | 1756 | 99
232 | W95634 | Homo sapiens | Homo sapiens secreted protein. | 1391 | 100
233 | K00365 | Homo sapiens | Human cyclin B1. | 2218 | 99
234 | Y53762 | Homo sapiens | A GTP-binding polypeptide designated RAO. | 1017 | 100
PCT/US00/34263





235 | ZSC74S | Homo sapiens | yeast sds22 homolog | 1800 | 100
236 | £50749 | Homo sapiens | yeast sds22 homolog | 1754 | 98
237 | A&C26491 | Homo sapiens | PICK1 | 2137 | 100
238 | AJ270205 | Entodinium caudatum | putative phosphatidylinositol-4-phosphate 5-kinase | 114 | 37
239 | AEC30189 | Mus musculus | contains transmembrane (TM) region and ATP binding region | 710 | 93
240 | W56538 | Homo sapiens | Human hedgehog interacting protein (HIP). | 3785 | 99
241 | W56538 | Homo sapiens | Human hedgehog interacting protein (HIP). | 3436 | 99
242 | AF3 55107 | Homo sapiens | NY-REN-37 antigen | 996 | 99
243 | AF155107 | Homo sapiens | NY-REN-37 antigen | 1005 | 100
244 | AL031320 | Homo sapiens | dJ20N2.1 (novel protein similar to yeast and bacterial cytosine deaminase) | 763 | 92
245 | U3 7C26 | Rattus norvegicus | sodium channel beta 2 subunit | 162 | 30
246 | AL07B599 | Homo sapiens | dJ991C6.1 (novel protein similar to C. elegans F55A12.9 (Tr:P91086)) | 2391 | 96
247 | U32274 | Saccharomyces cerevisiae | Ydr386wp; CAI: O.li | 191 | I
248 | YS2719 | Homo sapiens | Human PRO864 protein sequence. | 1079 | 100
249 | ABC29434 | Homo sapiens | ghrelin precursor | 611 | 100
250 | X97831 | Rattus norvegicus | carnitine/acylcarnitine carrier protein | 246 | 3fc
251 | WBC993 | Homo sapiens | Human RIP-interacting factor RIF. | 1724 | 100
252 | Y94673 | Homo sapiens | Human protein clone HP02632. | 1876 | 100
253 | W59876 | Homo sapiens | Amino acid sequence of the cDNA clone AIF-2 (HEBGM49). | 765 | 100
254 | AL354533 | Leishmania major | possible adenylate kinase | 265 | 34
255 | AF233322 | Mus musculus | zinc transporter like 2 | 1916 | 91
256 | Y7B113 | Homo sapiens | Human cytokine signal regulator CKSR-1 SEQ ID NO:1. | 2247 | n
257 | AL035S39 | Arabidopsis thaliana | putative amino acid transport protein | 390 | 21
258 | W74787 | Homo sapiens | Human secreted protein encoded by gene 58 clone HHFHN61. | 1171 | 100
259 | AL0J5689 | Homo sapiens | dJ187J11.1 (novel protein similar to protein kinase C inhibitors) | 974 | 100
260 | AE000909 | Methanobacterium thermoautotrophicum | serine/threonine protein kinase related protein | 363 | 30
261 | AL050131 | Homo sapiens | hypothetical protein | 626 | 100
262 | AFDI9661 | Mus musculus | zeta proteasome chain; PSMA5 | 1214 | 100
263 | AL035593 | Homo sapiens | dJ310J6.1 (novel protein) | 821 | 100
264 | AL022318 | Homo sapiens | bK150C2.3 (PUTATIVE novel protein similar to APOBEC1) | 1072 | 100
265 | AF205940 | Homo sapiens | endomucin | 1289 | 100
266 | AL023583 | Homo sapiens | dJ500I>14.1 (novel protein) | 789 | 100
267 | AL034548 | Homo sapiens | dJH03G7.3 (novel protein kinase domains containing protein similar to phosphoprotein C6FW) | 1888 | 99


268 | Ar J 614 7C | Homo sapiens | HSPC121 | 1884 | 96
269 | AF161470 | Homo sapiens | HSPC121 | 1232 | . st i
270 | X90763 | Homo sapiens | HHa5 hair keratin type 1 intermediate filament | 2190 | S c
271 | AF207600 | Homo sapiens | ethanolamine kinase | 1952 | 100
272 | M32334 | Homo sapiens | intercellular adhesion molecule I | 143G | 100
273 | AF161483 | Homo sapiens | HSPC134 | 663 | 6:
274 | YS3C52 | Homo sapiens | Human secreted protein clone df202_3 protein sequence SEQ ID NO:110. | 587 | 100
276 | Y77S76 | Homo sapiens | Human cytoskeletal protein (KCYT) (clone 2195418). | 762 | 100
277 | AF077042 | Homo sapiens | 30S ribosomal protein S7 homolog | 126? | 100
278 | Y94907 | Homo sapiens | Human secreted protein clone ca106_19x protein sequence SEQ ID NO: 20. | 1619 | 96
279 | Y68788 | Homo sapiens | Amino acid sequence of a human phosphorylation effector PHSP-2G. | 2803 | 99
280 | Z75134 | Canis familiaris | rod transducin | 1816 | 100
281 | Z75134 | Canis familiaris | rod transducin | 1718 | 96
282 | AF249873 | Homo sapiens | muscle-specific protein | 1395 | 100
283 | AbO5O0O7 | Homo sapiens | hypothetical protein | 405 | 96
284 | AF201531 | Homo sapiens | DCl | 1859 | 
285 | AF156102 | Homo sapiens | ELL complex EAP30 subunit | 1318 | 
286 | Y35897 | Homo sapiens | Extended human secreted protein sequence, SEQ ID NO. 146. | 1250 | 95
287 | U88964 | Homo sapiens | HEM4S | 923 | 100
288 | AL050143 | Homo sapiens | hypothetical protein | 598 | 100
289 | AJ011098 | Homo sapiens | telethonin | S74 | 100
290 | Y66724 | Homo sapiens | Membrane-bound protein PRO836. | 2323 | 100
291 | Ar UJ4 801 | Homo sapiens | liprin-alpha4 | 2565 | 96
292 | AF034DC1 | Homo sapiens | liprin-alpha4 | 2590 | 100
293 | AL049851 | Homo sapiens | dJ889J22B.l (novel protein (isoform 1)) | 1738 | 100
294 | Y73346 | Homo sapiens | HTRM clone e3965l protein sequence. | 124& | 9£
295 | L11672 | Homo sapiens | zinc finger protein | 1694 | 44
296 | AL035423 | Homo sapiens | dJ20l3.1 (brain mitochondrial carrier protein-1 (BMCP1)) | 1024 | 79
297 | AF198532 | Homo sapiens | lymphoid enhancer binding factor-1 | 2173 | 100
298 | AF161417 | Homo sapiens | HSPC295 | 1147 | et
299 | AF159141 | Homo sapiens | breast cancer metastasis-suppressor 1 | 1236 | 99
300 | U26397 | Rattus norvegicus | inositol polyphosphate 4-phosphatase | 160 | 30
301 | AF036145 | Homo sapiens | meningioma-expressed antigen S | 3458 | 100
302 | Z82022 | Homo sapiens | GlcNAc-1-P transferase | 2067 | 99
303 | AP269232 | Mus musculus | butyrophilin-like protein 13UrR-l | 271 | 50
304 | AJ222644 | Arabidopsis thaliana | asparaginyl-tRNA synthetase | 659 | 50
305 | AF054180 | Homo sapiens | hematopoietic cell derived zinc finger protein | 351 | 79
306 | AO272079 | Homo sapiens | APOBEC-1 stimulating protein | 3056 | 100
308 | Y44486 | Homo sapiens | Human GPRW receptor polypeptide. | 1721 | 100
309 | AJ131891 | Homo sapiens | DNA polymerase mu | 2598 | 100

310 | AF29333S | Homo sapiens | P30 DBC | i24e | 9'-
311 | AF17652S | Mus musculus | F-box protein FBLl^ | 1501 | Q j
312 | X578G2 | Homo sapiens | immunoglobulin lambda light chain | 959 | e:
313 | Z36715 | Homo sapiens | Net | 2048 | Sb
314 | AF161532 | Homo sapiens | HSPC047 | 727 | 100
315 | AF208068 | Homo sapiens | kelch-like protein KLHL3a | 3046 | 100
316 | Y6666e | Homo sapiens | Membrane-bound protein PRO1013. | 1166 | 100
317 | Y29666 | Homo sapiens | Human Ras protein RAPR-1. | 1253 | 5E
318 | AJ3877&7 | Homo sapiens | sialin | 2614 | C L
319 | AF161362 | Homo sapiens | HSPC099 | 224 | 40
320 | Y68773 | Homo sapiens | Amino acid sequence of a human phosphorylation effector FHSP-b. | 2243 | 95
321 | AJ238375 | Homo sapiens | putative TH1 protein | 3013 | 100
322 | AB040812 | Homo sapiens | protein kinase PAK5 | 37S2 | 9 f -
323 | YS5013 | Homo sapiens | Human secreted protein vc48 1, SEQ ID NO:6fc. | 913 | 100
324 | Y13383 | Homo sapiens | Amino acid sequence of protein PRO271. | 1976 | 100
325 | Y94944 | Homo sapiens | *"rfl^"7 T f> tirnrpin .spmipnrp *J1.A^> 1 XV pXULC^Ii OCVjUCIILC SEQ ID NO: 94. | 2305 | 9e
326 | Y76884 | Homo sapiens | Retinoblastoma binding protein-7 sequence. | 6726 | 55
327 | AF19853^ | Homo sapiens | lymphoid enhancer binding factor-1 | 2173 | 100
328 | Z78013 | Caenorhabditis elegans | Similarity to Drosophila Cadherin-related tumor suppressor | 569 | 33
329 | AF212921 | Mus musculus | MMTV receptor variant 3 | 484 | 
330 | Z7533C | Homo sapiens] >R6S20'J R65207 02-MAR-19S5 27- Human bUi vJI nd X X ] 1 "* JL . (Homo sapiens | nuclear protein SA-3 | 6492 | 95
331 | AL008583 | Homo sapiens | dJ327J16.3 (supported by GENSCAN, FGENES and GENEWISE) | 2133 | 
332 | Y36104 | Homo sapiens | Extended human secreted protein sequence, SEQ ID NO. 489. | 310 | 4:
333 | AJ271669 | Homo sapiens | putative sialoglycoprotease | 1747 | 100
334 | AF156598 | Mus musculus | p53-regulated DDA3 | 997 | 64
335 | M99056 | Eimeria maxima | em100 gene is homologous the Eimeria tenella gene et100 | 154 | 2t
336 | Y85564 | Homo sapiens | Human homologue of UNC-53 (Hs-UNC-53/1) sequence. | 3386 | $1
337 | Y8S564 | Homo sapiens | Human homologue of UNC-53 (Hs-UNC-53/1) sequence. | 2602 | 9^
338 | Y85564 | Homo sapiens | Human homologue of UNC-53 (Hs-UNC-53/1) sequence. | 3447 | 96
339 | Z66561 | Caenorhabditis elegans | Similarity to Human rablj protein (PIR Acc. No. A49647). | 716 | 34
340 | AB021643 | Homo sapiens | gonadotropin inducible transcription repressor-:- | 276: | 99
341 | G01946 | Homo sapiens | Human secreted protein, SEQ ID NO: 6027. | 465 | Bi>
342 | AF0205S1 | Homo sapiens | zinc finger protein | 1051 | Ah
343 | L29154 | Homo sapiens | immunoglobulin heavy chain VDJ region | 435 | 84



344 


U10263 


sus scrofa 


gastric mucin 


279 


24 


345 


AX0004 04 


Komo sapiens 


unnamed protein product 


1177 


99 


346 


L2255', 


Rat tut 
norvegicus 


calmodulin- binding protein 


1941 


84 


347 


L2255- 


Rattus 
norvegicus 


calmodulin -binding protein 


236? 


9i 


348 


AL049481 


Arabadopsis 
thaliana 


AIGl-like protein 


310 


3C 


350 


AJ251S26 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


95 


351 


AK024477 


Homo sapiens 


FLJ00070 protein. 


1773 


100 


352 


U50132 


Homo sapiens 


ankyrin 


502 


33 


353 


AK00062S 


Homo sapiens 


unnamed protein product 


721 


10C 


354 


AF161420 


Homo sapiens 


HSPC302 


262? 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356 


AF151029 


Homo sapiens 


HSPC19^ 


943 


91 


357 


7\L022327 


Homo sapiens 


dJ355C18.1 {KIAA0027J 


1911 


100 


358 


W7812P 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96 . 


1117 


100 


359 


X03414 


Drosophila 
melanogaeter 


Kr polypeptidi 


316 


45 


360 


AF1S1079 


Homo sapiens 


HSPC24 1 


643 


100 


361 


Y538B6 


Homo sapiens 


A suppressor ot cytokine 
signalling protein 
designated HSCOP-6 . 


530 


43 


362 


AF2S4741 


Drosophila 
nelanogaster 


Centaurin Gamma 1A 


681 


46 


363 


AF21346S 


Homo sapiens 


dual oxidast 


201f 


100 


364 


AF161562 


Homo sapiens 


proSAAS 


1319 


100 


365 


AFI61562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U73200 


Mus musculus 


pll6Rip 


864 


82 


367 


AF263744 


Homo sapiens 


erbb2 -interacting protein 
ERBIN 


4973 


95 


368 


U375 03 


Mus musculus 


laminin alpha 5 chain 


5867 


72 


369 


AF04369S 


Caenorhabdit 
is elegans 


similar to the protein 
phosphates 2c family 


545 


36 ! 


370 


Y7344C 


Komo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO:102 . 


1484 


99 


371 


AF272833 


Homo sapiens 


mi sate 


2869 


97 


372 


AF1984 54 


Homo sapiens 


epithelial protein lost in 
neoplasm bete 


3927 


100 


373 


Y7334S 


Homo sapiens 


HTRM clone 436283 protein 
sequence. 


273 


80 


374 


AF16 9017 | Komo sapiens 

i 


f ormi mi no t r an s t e r a se 
cycl odeaminasf 


2717 


98- 


375 


A9S106 | unidentified 


RED ALPHA 


1202 


99 


376 


W74E2f 


Komo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLOA352 . 


ioi: 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518 


382 


100 


380 


X66565 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41€99 


Homo sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7006 


98 


383 


U64606 


Caenorhabditis elegans 


coded for by C. elegans cDNA 
ykl73cl2.£ 


246 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385 


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


4123 


97 


TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 
IDENTITY 


387 


AF2 08841 


Homo sapiens 


bm-oc;- 


137S | S c 


389 


X5762: 


Homo sapiens 


immunoglobulin lambda light 
chain 


7S'. \ it 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


338t 


97 


393 


AF17843;. 


Homo sapiens 


SH3 protein 


37 0C 


100 


394 


AF22992f 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


163 I- 


62 


395 


AF181723 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human betaIV-spectrin 
protein. 


162 6 


98 


397 


134823b 


Mus musculus 


zinc finger protein neuro-d4 


74 5 


6C 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF21752^ 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


Schizosaccharomyces 
pombe 


WD repeat protein 


447 


27 


401 


AC004 655- 


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase ; similar to 
Q02218 (PID:cl}S26l8) 


417C 


78 


402 


AB01026* 


Mus musculus 


tenascin-X 


1024 6 


62 


403 


AL13328fr 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CG5986 
protein) 


761. 


100 


404 


Z68753 


Caenorhabditis elegans 


ZC518.3b 


88t 


46 


405 


Z7801? 


Caenorhabditis elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


565- 


33 


406 


AB03:23G 


Homo sapiens 


protein containing CXXC 
domain / 


119£- 


97 


407 


AF155101 


Homo sapiens 


NY-REN-36 antigen 


116fc 


100 


408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-6S . 


153lr 


99 


409 


Z1836J 


Ovis aries 


trichohyalin 


184 


30 


410 


AF249744 


Homo sapiens 


RhoGEF 


273? 


100 


411 


AF176 52S 


Mus musculus 


F-box protein FBX13 


2071 


94 


412 


AF210842 


Homo sapiens 


HARF 


4 8 80 


100 


413 


AL032655 


Homo sapiens 


dJ3l0O33.7 (novel protein 
similar to H. roretzi HRPET-3) 


770 


98 


414 


X57396 


Homo sapiens 


pm5 protein 


613J 


95 


415 


AB02982£ 


Homo sapiens 


3-methylcrotonyl-CoA 
carboxylase biotin-containing 
subunit 


296; 


99 


416 


U43502 


Saccharomyces cerevisiae 


Lphlp 


111 


42 


417 


AL1604S.' 


Leishmania 
major 


possible t26fl7.21 


23? 


35 


418 


Y0810C 


Homo sapiens 


Human PRO331 protein. 


330 


29 


419 


U15131 


Homo sapiens 


pl26 


222t 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF3 0215C 


Homo sapiens 


phosphoinositol 3-phosphate-
binding protein-2 


196i 


100 


423 


AL13753C 


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 




7269 


100 


425 


AB02724S 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1064 


55 


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


12SS- 


56 


428 


AE003683 


Drosophila 
melanogaster 


CG6312 gene product 


I4 C | 


2 9 


429 


Y07829 


Homo sapiens 


RING finger protein 


220j 


99 


430 


AP096897 


Drosophila 
melanogaster 


pushover 


4441 


4 7 


431 


U41387 


Homo sapiens 


Gu protein 


402} 


99 


432 




Homo sapiens 


nephrocystin 


3 7 fc j 


100 


433 


AF146760 


Homo sapiens 


septin 2-like cell division 
control protein 


2264 


100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


86C 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP . 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 


1071 


63 


439 


AF105228 


Bos taurus 


tuftelin 


28S 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40553). 


307j 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (A) (AA 1-977) 


489'/ 


96 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA li - 
938) 


■1QTC 

j y i - 


81 


443 


Y66689 


Homo sapiens 


Membrane-bound protein 
?R0113t . 


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707 


114 


j J 


445 


AF229032 


Mus musculus 


p!3 


2077 


93 


446 


AF05603S 


Rattus 
norvegicus 


s-nexilin 


2662 


85 


447 


AF132484 


Mus musculus 


unknown 


476 


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156 . 


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabditis elegans 


ZC518.3b 


95i 

i 


4 9 


451 


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3. 


155 


j £ 


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46_10) . 


279^ 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS11S. 


281C 


100 


454 


D87438 


Homo sapiens 


Similar to a C.elegans 
protein in cosmid C14H10. 


406S 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-F 


1330^ 


99 


457 


M5S216 


Homo sapiens 


gamma-aminobutyric acid 
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61_l protein sequence SEQ 
ID NO:156. 


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene 18 clone 
HSLFM25 . 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


27S 


19 


461 


D87446 


Homo sapiens 


Similar to a C.elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125. 


1 486 


93 


463 


AC002398 


Homo sapiens 


. F25965 j 


1016 


200 


464 


AF064856 


Rattus sp. 


7acomp protein 


1845 


84 


465 


AF223408 


Homo sapiens 


E95 


3686 


99 


466 


AF22340e 


Homo sapiens 


E9^ 


2871 


87 


467 


AF104415 


Mus musculus 


gene trap locus-13 


633t 


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 
JDP- j 


19'. 


4 £• 


469 


AL031297 


Homo sapiens 


dJ97?20.i (novel gene) 


3 564 


y 9 


470 


AF257077 


Homo sapiens 


eukaryotic translation 
initiation factor EIF2F 
subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein 


284 


i 

i 


472 


V 84 903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMF protein 


25i 


44 


474 


Y7122 3 


Homo sapiens 


Human irritable bowel disease 
related polypeptide 1NX39. 


83 ir 


100 


475 


Y9500i 


Homo sapiens 


Human secreted protein 
vel3_l, SEQ ID NO: 52. 


341: 


100 


476 


D38S4 9 


Homo sapiens 


ha!025 is new 


6531- 




477 


AF241230 


Homo sapiens 


TAK1-binding protein 2 


365t 


100 


478 


AL031534 


Schizosaccharomyces 
pombe 


putative asparagine synthase 


46: 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein 


232 


26 


480 


AF161544 


Homo sapiens 


HSPC05S- 


434 


77 


481 


AJ238248 


Homo sapiens 


centaurin beta2 


398f 


99 


482 


Z3 806a 


Saccharomyces cerevisiae 


mal5, sta1, len: 1367, CAI: 
0.3, AMYH_YEAST P08640 
GLUCOAMYLASE S1 (EC 3.2.1.3) 


291 


23 


483 


AF161381 


Homo sapiens 


HSPC26: 


1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


131* 


100 


486 


X57527 


Homo sapiens 


alpha 1(VIII) collagen 


416( 


99 


487 


Y19062 


Homo sapiens 


39k2 protein 


247! 


100 


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence . 


55 1 


5£ 


489 


AL021916 


Homo sapiens 


b34I6.1 (Kruppel related Zinc 
Finger protein 184) 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 3-
938) 


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


145.S 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportin1 


292^ 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein 
with some similarity to 
Drosophila KRAKEN) 


513 


96 


495 


AF036977 


Homo sapiens 


unknown 


1812 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO:126 . 


357 


100 


498 


AF069781 


Drosophila 
melanogaster 


Bem46-like protein 


653 


43 


499 


Y16603 


Homo sapiens 


Human cell-cycle 
pnospnoprotem cectf-/ . 


1656 


38 


500 


X70944 


Homo sapiens 


PTB-associated splicing 


3 883 


100 


501 


AF027503 


Mus musculus 


putative membrane-associated 
guanylate kinase 1 


one 
ZVZ> 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


G6 protein 


669 


100 


504 


AF208861 


Homo sapiens 


BM-019 


162S 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100 


507 


X66285 


Mus musculus 


HC1 ORF 


115 


43 


508 


D00185 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit 


5227 


99 


509 


Y94971 


Homo sapiens 


Human secreted protein clone 
fal71_i protein sequence SEQ 
ID NO:148. 


2176 


100 


510 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase 


781 


77 


511 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase 


3347 


100 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase 


3520 


95 


513 


X84908 


Homo sapiens 


phosphorylase kinase 


5729 


99 


514 


X52851 


Homo sapiens 


peptidylprolyl isomerase 


650 


76 


515 


AF186084 


Homo sapiens 


repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus 


50 kDa protein 


3 74 9 


77 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734. 


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y99366 


Homo sapiens 


Human PRO1475 (UNQ746) amino 
acid sequence SEQ ID NO: 86. 


3394 


97 


521 


AF2668S2 


Homo sapiens 


PTPLA 


1295 


100 


522 


riZi\> O W V _» D 


Archaeoglobus fulgidus 


chromosome segregation 
protein (smc1) 


152 


20 


523 




Homo sapiens 


immunoglobulin heavy chain 
variable region 


605 


97 






Rattus 
norvegicus 


are; 




p p 
v c 


_) <C _ 




Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


T *3 1 C 


a - 

t 


bit b 


Ar 14 bbbb 


Drosophila 
melanogaster 


BcDNA.GH10229 


32C 


33 


527 


At 11^-^13 


Homo sapiens 


putative Rab5-interacting 
protein 


524 


n c 
/ i* 


528 


D49387 


Homo sapiens 


NADP dependent leukotriene b4 
12-hydroxydehydrogenase 


1616 


100 


529 


Y3 0819 


Homo sapiens 


Human secreted protein 
encoded from gene S 


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 kDa protein 
(DKFZP564A032, SBBI8B) 
similar to mouse IFN-gamma 
induce MG11.) 


1055 


9S 


531 


Y91506 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


1159 


96 


532 


A /O J.1C 


LacHUI iidlJU J. L. 




576 


50 


533 


X76 11 6 


Caenorhabditis elegans 


carrier protein (c2) 


506 


50 


534 


X32966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA) 


1972 


100 


535 


Y09267 


Homo sapiens 


flavin-containing 
monooxygenase 2 


2486 


100 


536 


Z11773 


Homo sapiens 


SRE-ZBP 


2201 


99 


537 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4741 


99 


538 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


2933 


96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 


541 


J03244 


Bos taurus 


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-ll. 


2301 


99 


543 


AF221712 


Homo sapiens 


Smad- and Olf-interacting 
zinc finger protein 


21 51 


61 


544 


AE000919 


Methanobacterium 
thermoautotrophicum 


conserved protein 


207 


38 


545 


A0666S 


synthetic 
construct 


preTGF-beta1 


2070 


99 


546 


Y0269& 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS6C. 


854 




547 


AF112205 


Homo sapiens 


WSB-1 protein 


2275 


100 


548 


X60271 


Mus musculus 


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


4; 


550 


Y70400 


Homo sapiens 


Human cell-signalling 
protein-2. 


429 


6e 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1 


8290 


9S: 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4 . 


1112 


91 


553 


AF119855 


Homo sapiens 


PRO1847 


265 


67 


554 


Ml 723 V 


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 






thaliana 


putative protein 


540 


4 0 


556 


AC006563 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77C27 
(PID:g4650844) 


515 


44 

! 


557 


AK0244 87 


Homo sapiens 


FLJ00086 protein 


1623 


9fc 


558 


M1214C 


Homo sapiens 


pol gene protexn; Xx>: 


117 


4f i 


559 






encoded by gene 97 clone 


225 


5( 


560 


X56681 


Homo sapiens 


junD protein 


373 




DO J. 


^Tri nit 

/if UJ^l. C 


Caenorhabditis elegans 


rr»T\)*ainc vpalf ei rri-i ~\ a r"i t" V to 

an AMP-binding motif 


2926 


54 


562 


AL10983 9 


Homo sapiens 


dJ10€9P2.3.1 {novel PAEPCI 
\poj.y \jr\f -Dinoiny piy^ciJij 


877 


100 


563 


Arlel©4U 


Drosophila 
melanogaster 




289 


41 


564 




Feline leukemia 
virus 


gPr80 


1547 


4 * 
* - 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 


Homo sapiens 


pt326 4 secreted protein. 


3338 


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen 


3603 


9:- 


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen 


3951 


95 


571 


AX032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 


Sir 


572 








7350 


9i 


573 


M69181 


Homo sapiens 


non- muscle myosin £ 


7311 


9fc 


574 




Homo sapiens 


<lc*i~ro f£*r\ nrnhP "i Y"l 1 0 ft - 0 0 fl - - 0 - 

^CUI C LcU JJJ.Ul.Cj.il JL V O \J \J O -> w 

E6-FL. 


772 


100 


575 




Arabidopsis 
thaliana 


putative protein 


788 


4C 


576 


AJU365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit 
(AA 1-1462) 


7619 


C.C 


578 


AB041642 


Homo sapiens 


PAR-6 


1342 


100 


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A 
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 56. 


2339 


99 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100 


583 


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11425 


99 


584 


AJ223948 


Homo sapiens 


RNA helicase 


6608 


95: 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


3874 


99 


586 


Y42384 


Homo sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


se 


588 


AF13177I:. 


Homo sapiens 


Unknown 


1525 


99 


589 


AJ250865 


Homo sapiens 


TESS '< 


234i 


100 


593 


2988eb 

! 


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to 
peregrin, BR140)) 


416'. | 


100 


592 


L76573 


Homo sapiens 


nuclear hormone receptor 


13 5 5 


100 


593 


AF0916^ 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


4442 


100 


595 


AL1378C2 


Homo sapiens 


dJ798A10.1 (novel protein) 


212 


55 


596 


AL022325 


Homo sapiens 


bK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3652 


100 


597 


AF22604£ 


Homo sapiens 


GL003 


2005 


55 


598 


AJ2781I* 


Home 
sapiens] 
>Y49635 
Y49635 21- 

APR-1998 
iiuntcui sup j . d 
nrnhp ■} n 

[Homo 


putative cell cycle control 
protein 


335 


23 


599 


Y5974 2 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10- 


1574 


55 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit 


5386 




601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


gga: 


3265 


100 


603 




Homo sapiens 




5073 


5 5 


604 


AL132776 


Homo sapiens 


riil^93ni2 1 (KIAA0776) 


2 413. 


95 ' 


605 


AL034452 


Homo sapiens 


dJ6B2J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


2979 


100 


606 


Y14494 


Homo sapiens 




3465 




607 


AJ0C1982 


Homo sapiens 


OXA1L 


2603 


100 


608 


X86G98 


Homo sapiens 


binds directly to adenovirus 
type 5 E1A protein 


3069 




610 


AF163572 


Homo sapiens 


Forssman glycolipid 
synthetase 


1865 


cc 


611 


AF161503 


Homo sapiens 


HSPC154 


1261 


57 


612 


L41834 


Ensis minor 


nuclear protein 


34 5 


3 0 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9). 


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


261 


54 


615 


X85766 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2 


3487 


c c 


617 


D12644 


Mus musculus 


KIF2 protein 


3605 


5*> 


618 


U2876S 


Mus musculus 


PACT 


5936 


6 5 


619 


Y35914 


Homo sapiens 

i 


Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


1684 


55 


620 


AB046382 


Mus musculus 


testis-abundant finger 
protein 


195 




621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


5 5 


622 


AF068286 


Homo sapiens 


HDCMD38P 


863 


100 


623 


X98248 


Homo sapiens 


sortilin 


4436 


55 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


co 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


52 


627 


X14966 


Homo sapiens 


RII-alpha subunit (AA 1-404) 


2075 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100 


629 


Y50913 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1694 


101 


630 


AF09878! 


Homo sapiens 


17 beta-hydroxysteroid 
dehydrogenase type via 


1754 


100 


631 


ALD3455H- 


Homo sapiens 


dJ134O19.3 (zinc finger 
protein 151 (pHZ-67)) 


4271- 


100 


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


7S4 


Si 


633 


AF288288 


Homo sapiens 


HPT protein 


223e 


100 


634 


AF04142S 


Homo sapiens 


pRGR1 


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein 
kinase 


158^ 


100 


636 


Y11284 


Homo sapiens 


AFX 3 


257 ;i 


9f 


637 


ARO048B4 


Homo sapiens 


PKU-alpha 


37ie 


9_ L 


638 


AJ002303 


Homo sapiens 


synaptogyrin 1c 


102C 


100 


639 


AJ002304 


Homo sapiens 


synaptogyrin 1b 


100; 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c 


93± 


94 


641 


D87682 


Homo sapiens 


similar to a C.elegans 
protein encoded in cosmid 
T26A5. 


267fc 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


247? 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1356 


100 


644 


AF119900 


Homo sapiens 


PR0262? 


185 


7( 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


T, 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


1011C 


9<r 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B13 
homolog 


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3830 


93 


650 


AL034553 


Homo sapiens 


dJ9i4P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent 
neuroprotective protein 
(Adnp)) 


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit 


2388 


99 


654 


AC004614 


Homo sapiens 


similar to i-spondin proteins 
AB006086 (PID:g2529225) 


302t 


99 


655 


Y5790B 


Homo sapiens 


Human transmembrane protein 
HTMPN-22. 


608 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


3732 


100 


658 


AL050306 


Homo sapiens 


dJ475B7.2 (novel protein) 


194; 


99 


659 


W76734 


Homo sapiens 


Human mDia Rho targeting 
protein . 


763 


34 


660 


AF202724 


Homo sapiens 


Sadl unc-84 domain protein 1 


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus 


dysferlin 


4752 


S9 


663 


AF182316 


Homo sapiens 


myoferlin 


6232 


99 


665 


AL161516 


Arabidopsis 
thaliana 


hypothetical protein 


209 


30 


667 


X59303 


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y13355 


Homo sapiens 


Amino acid sequence of 1 
protein PR022C . 


3(592 " 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo-
beta-N-acetylglucosaminidase 
gene 


612 


52 


671 


X56123 


Mus musculus 


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP13 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus 
norvegicus 


transducin 


3619 


92 


PCT/US00/34263 


676 


AC005757 


Homo sapiens 


R32G12_ : 


2775- 


100 


677 


S6lQCi 


Homo sapiens 


reverse transcriptase 
homolog pol (retroviral 
element) 


25: 


65 


678 


AF271386 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


227> 


100 


679 


X7906C 


Homo sapiens 


ERF- j 


1767- 


100 


680 


AF118566 


Mus musculus 


hematopoietic zinc finger 
protein 


76* 


50 


681 


YS1415 


Homo sapiens 


Human wild type pXe83 
protein. 


262; 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


66 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 
sequence . 


588t 


99 


684 


V94 95:- 


Homo sapiens 


Human secreted protein clone 
fhll6_il protein sequence 
SEQ ID NO:110. 


35'; 

- 


9e 


685 


AL021878 


Homo sapiens 


dJ257J20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2)) 




67 


686 


AE000196 


Escherichia 
coli 


orf, hypothetical protein 


62t 


100 


687 


M5837t 


Homo sapiens 


synapsin 1 


3730 


99 


688 


AF039697 


Homo sapiens 


antigen NY-CO-31 


sot 


96 


689 


U0935b 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 E 
gamma subunit 


235t 


99 

I 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen 


26L 


5C 


691 


AC004774 


Homo sapiens 


Dlx-I 


154; 


100 


692 


X90530 


Homo sapiens 


ragF 


192* 


99 


693 


X90530 


Homo sapiens 


ragl- 


140^ 


9S 


694 


X90530 


Homo sapiens 


ragt 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644. 


330 


100 


696 


AC011810 


Arabidopsis 
thaliana 


Putative methionine 
aminopeptidase 


66^ 


52 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin : 


245b 


96 


698 


AB037901 


Homo sapiens 


gene amplified in squamous 
cell carcinoma-1 


5364 


99 


699 


YS940; 


Homo sapiens 


Human PRO1327 (UNQ687) amino 
acid sequence SEQ ID NO:216. 


138C 


100 


70: 


AF221712 


Homo sapiens 


Smad- and Olf- interacting 
zinc finger protein 


6705 


100 


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml . 


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml . 


1736 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 

1 


AL022237 


Homo sapiens 


bXH9lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G0357- 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


77*/ 


99 


710 


Y086 9fc 


Homo sapiens 


ranbp5 


284? 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2 . 


754 


99 


712 


US357<i 


Homo sapiens 


putative pl5C 1 7S5 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD 
box helicases 


2711 


99 


714 


DO 9 01 6 


Homo sapiens 


Neuroblastoma 


53t 


48 


715 


Y 5217b 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine- 
phosphatase 2. 


734 


96 


716 


AL137013 


Homo sapiens 


bA311P8.3 (probable uracil 
phosphoribosyltransferase) 


862 


100 


717 


AB035123 


Mus musculus 


GD1 alpha/GT1a alpha/GQ1b 
alpha synthase 


16SC 


93 


718 


Y96290 


Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
1 Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2341 


B5 

I 

! 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


434 7 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


214 ! 


99 


721 


Y0759B 


Homo sapiens 


transcription factor TFIIK 


2373 


100 


722 


K4 3 56S 


Homo 

sapiens' 

>W41564 

W41564 08- 

OCT-1997 05- 

APR- 1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain. 


1S91 


99 

i 


723 


AF161341 


Homo sapiens 


HSPC07^ 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis elegans 


contains similarity to 
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP3 3 
(GB:Z72876) 


1143 


46 


726 


AC006708 


Caenorhabditis elegans 


contains similarity to 
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP33 
(GB:Z72876) 


988 


46 


727 


AC024818 


Caenorhabditis elegans 


contains similarity to Pfam 
family PF00400 (WD domain, 
G-beta repeat), score=81.6, 
E=1.4e-20, N=3 


95C 


44 


728 


A0005897 


Homo sapiens 


JMb 


831 


47 


729 


Y4 5377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27 . 


908 


97 


730 


G0393I 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


57c- 


100 


731 


AB012720 


Oncorhynchus 
masou 


GTP-binding protein 


386t 


76 


732 


W73004 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8. 


862 


97 


733 


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


97 


734 


AC024813 


Caenorhabditis elegans 


Hypothetical protein 
Y54FlOAL.a 


152 - 


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol 
phosphatidyltransferase 
family member protein) 


1562 


98 


736 


U00032 


Caenorhabditis elegans 


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098 


Homo sapiens 


arginine-tRNA-protein 
transferase 1-1p; ATE1-1p 


2733 


99 


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


27 93 


100 


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


95 


740 


X9825fc 


Homo sapiens 


M-phase phosphoprotein 9 


952 


100 


741 


X98256 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


V97192 


Caenorhabditis elegans 


strong similarity to the YPT1 
sub-family of RAS proteins 


960 


8S 


743 


X76057 


Homo sapiens 


phosphomannose isomerase 


2193 


100 


744 


G0320S 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290. 


496 


9£ 


745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


9S 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73386 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


748 


M19529 


Sus scrofa 


follistatin A 


1906 


96 


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


26 


750 


AC004410 


Homo sapiens 


fos29554_: 


2094 


100 


751 


AF074968 


Homo sapiens 


p47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Sp1


4005


100 


753 


AB049629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


95 


754 


D79205 


Homo sapiens 


ribosomal protein L35


160 


77 


755 


AB008430


Homo sapiens 


CDE1- 


142 


29 


755 


L32162 


Homo sapiens 


transcription factor


574 


80 


755 


AF037204 


Homo sapiens 


RING zinc finger protein


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


763 


AF218586 


Homo sapiens 


Cioe-b 


1136 


100 


76; 


U38934 


Gallus 
gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-P 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 


100


765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


566 


36 


766


AL023828 


Caenorhabdit 
is elegans 


Y17G7B.14 


200 


27 


767 


Y 82 77-7 


Homo sapiens 


Human chordin related protein 
(Clone dw665_4). 


2551 


9S 


76e 


X92475 


Homo sapiens 


itba: 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3). 


1426 


100 


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-
521) 


2641 


97 


773 


AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap; 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase 


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated
protein 5 (CYSKP-5).


565 


43


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


855 


56


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


776 


G01880 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5961.


849 


96 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078.S82 


Homo sapiens


dJ130E4.2 (KIAA0796)


1321 


66 


781 


Z75955 


Caenorhabdit 
is elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain-
containing 1 protein)


900 


100


783 


AF061262 


Mus
musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03373 


Homo sapiens 


Human secreted protein, SEQ


649


95 








ID NO: 7954. 








Y84441 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rata protein, RAB?-;., 
protein sequence. 


1046 


95


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit


1548 


S9 J 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962


94


789 


AF024631 


Homo sapiens 


ang; 


2644


100


790 


AJ006730


Rattus
norvegicus


phosphatidylinositol 3-kinase


4500 


97 


792 


V00636 


bacteriophag
e lambda 


reading frame eaic 


600 


100


793 


AF0491C3 


Homo sapiens 


Huntingtin interacting
protein 


819 


100


7 SE: 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens


Retinoblastoma binding 
protein-7 sequence.


5080 


39 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


13B 


U97189 


Caenorhabdit 
is elegans


strong similarity to the
PI3/PI4 family of kinases


227 


26 


799 


AF112201 


Homo sapiens


neuronal protein NP25


1053 


100 


800 


AF23476L 


Rattus
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP8C 


956 


63 


so: 


AF267852 


Homo sapiens 


placental protein 13-like
protein 


743 


99 


802


AF208851


Homo sapiens 


BW-009 


766 


80 


803 


Z81097 


Caenorhabdit 
is elegans


Similarity to Human 
retinoblastoma-binding
protein RBAP46 yk662dl2.! 
comes from this gene 


152 


27 


804 


G02113


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6194. 


496 


98 


805


AL121673 


Homo sapiens 


bA305P22.1 (novel protein) 


1160 


100


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30


807 


AC013483 


Arabidopsis
thaliana 


putative GTPase activator
protein 


264 


30


806 


AB013885


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


SIC 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


611 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


£12 


Z74029 


Caenorhabdit 
is elegans 


similarity to C.elegans 
alcohol dehydrogenase comer 
from this gene 


610 


71 


813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100 


814


W87689 


Homo
sapiens 


Human HTXFT29 polypeptide. 


1484 


99 


815


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


826 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36


618 


AB03G483 


Mus musculus 


B9 


197 


27


819 


AL11755S 


Homo sapiens 


hypothetical protein


321 


94 


820 


AC005328 


Homo sapiens 


R26660^2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8032. 


700 


99 


E22 


L34807 


Musca 
domestica


transposase 


174 


20 


623 


GC2928 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7009. 


558 


76 


E24 


Z99531 


Schizosaccha 


caffeine- induced death 


184 


29 






romyces
pombe


protein 1 






82S 


AJ006692 


Homo sapiens


ultra high sulfur keratin


692 


6* 


826 


U23037 


Oryctolagus 
cuniculus 


eIF-2Bepsilon


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493. 


464 


100 


828 


Y30327 


Homo sapiens 


Human secreted protein
encoded from gene 17. 


113 

j 


4', 

t 


829 


Y32199 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012


100


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


o c 


832 


AB011542 


Homo sapiens


MEGFS 


2097 


100 


833 


G02639 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP 


1574 


100


835


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP 


1144 


8*: 


836 


AF119664 


Homo sapiens 


transcriptional regulator
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens 


C protein (AA 1-15S; 


916 


100


836 


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


26 


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2 


631 


S( 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase 


2840 


91 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6390. 


278 


9^ 


643 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


4f 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


625 


100 


845 


U27838 


Mus musculus 


glycosyl-phosphatidyl-
inositol-anchored protein
homolog 


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848


AF164794 


Homo sapiens 


Diff33 protein homolog


2398 


ior 


849 


U4131S: 


Homo sapiens 


ZNF127-Xp 


245E 


92 


850 


AF192784 


Homo sapiens 


makorin 1 


2062 


97 


851 


Y58628 


Homo sapiens 


Protein regulating gene 
expression PRGE-21. 


154B 


100


852 


Z22968 


Homo sapiens 


M130 antigen


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 


854


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443.


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7443. 


203 


100 


856 


AF285118 


Homo sapiens 


CGI-203 


4S2 


100 


857 


AC006069


Arabidopsis
thaliana


putative cleavage and 
polyadenylation specificity
factor 


1383 


5S j 


858 


AL021546 


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100


659 


L02956 


Xenopus
laevis 


ribonucleoprotein 


1664 


8b 


860 


AF201947 


Homo sapiens


MEK binding partner 1 


616 


100


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabdit 
is elegans


mitochondrial carrier protein 


370 


42 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type 1


3559 


99 
receptor associated protein






865 


AE001530


Helicobacter 
pylori J99 


putative 


230




866 


X57EC' 


Homo sapiens 


immunoglobulin lambda light 
chain 


69* 


s: 


867 


AL03167} " 


Homo sapiens 


d0694EI4.1 (PUTATIVE novel 
KRAB box protein with 28 C2H2 
type Zinc finger domains) 


4066 


99 


866 


Y116SS 


Homo sapiens 


phosphate cycles* 


23 h | 100 , 


865 


A? 19296 6 


Homo sapiens 


high-glucose-regulated
protein i 


3041


99


870 


AB020646 


Homo sapiens 


KIAA0841 protein 


3237


95


871 


AL031427 


Homo sapiens 


dJ167A15.1 (novel protein) 


160t


100


872 


API 5153 4 


Homo sapiens 


core histone macroH2A2.2


1866


100


"873 


AL02133! 


Homo sapiens 


dJ366N23.1 (putative C.
elegans UNC-93 (protein 1,
C46F11.1) LIKE protein)


1125 


100 


874 


X14606 


Homo sapiens 


propionyl-CoA carboxylase 


3575- 


100 


875 


AU117334 


Homo sapiens 


dJ687Fil.l (novel protein 
(part of translation of cDNA 
DKFZp434N061, Em:AL110249))


306 


100 


876 


X79489 


Saccharomyce 
s cerevisiae 


E-925 proteii. 


446


3' 


877 


YS30O2 


Homo sapiens 


Human secreted protein clone 
dn834_l protein sequence SEQ 
ID NO: 6.


aij 


100


878 


AF231064 


Homo sapiens 


CHMP1.5 


957 


100 


879 


X79417 


Sus scrofa 


40S ribosomal protein S12 


687 


100 


680 


AF001317 


Saccharomyce 
s cerevisiae 


Soilp 


476 


26 


881 


Y6727i. 


Homo sapiens 


Human signal peptide 
containing protein HSPF-52 
SEQ ID NO: 52 


2547 


100


882 


1*14 036 


Homo sapiens 


C1-inhibitor


596 


7/ 


883 


AE04I26: 


Homo sapiens 


calcium-independent
phospholipase A2


2903 


100 


884 


AF020313 


Mus musculus


proline-rich protein 4e


999 


84 


88S 


YI093e 


Homo sapiens 


hypothetical protein 


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


B66 


3( 


887 


Y57893 


Homo sapiens


Human transmembrane protein 

HTMPN- 1 *, . 


1099 


94 


888 


AL11763B 


Homo sapiens 


hypothetical protein 


929 


99


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT9


2046 


99 


890 


Y36031


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


893 


Y36033


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 


AF23763J 


Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 


1796 


100


893 


AF090929 


Homo sapiens 


PRO0477; 


653 


99 


894 


AL031226 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S.
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1)


3196 


100 


89B 


AL031226 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


2825 


96 


896 


AF3 71102 


Homo sapiens 


retinal degeneration B beta


1302 


9E 


897 


AE003551 


Drosophila 
melanogaster 


CG18176 gene product 


633 


33 


696 


AJ237946 


Homo sapiens | DEAD box Protean 5 ( 244 > 


100


89<- 


Z9718/, 


Komo sapiens ! KKE: i 624 


100 


900 


Z9716< 


Homo sapiens | KKE: | 4 05- 


98 


901 


AJ245587 


Homo sapiens


Kruppel-type zinc finger


194; 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A


io:: 


100 


903 


R95951- 


Homo sapiens


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04732


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila
melanogaster 


CG10984 gene product 


44fe 


33 


906 


M55542 


Homo sapiens 


guanylate binding protein 
isoform ] 


2993 


98 


90*? 


H5554: 


Homo sapiens 


guanylate binding protein 
isoform -


2901 


96 


9oe 


W84 08. u 


Homo sapiens 


Human membrane fusion protein 
WDProi . 


ibbs- 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


91C 


AB029150 


Homo sapiens 


KRAB zinc finger protein 


219*- 


100 


91j 


G0287: 


Homo sapiens


Human secreted protein, SEQ
ID NO: 6952.


52" 


100 


911 


G03162 


Homo sapiens 


Human secreted protein, SEQ
ID NO : 724 j . 


387 


87 


913 


AJ243721 


Homo 
sapiens] 
>Y92508 
Y92508 13- 

ado "Jrinn nc 
AJrrl Ub- 

OCT-1998 

Unman CY Z>P — 

5 . [Home 


dTDP-4-keto-6-deoxy-D-glucose 
4-reductase 


171 C 


100 


914 


U24189 


Caenorhabdit
is elegans 


hypothetical protein 1207-1; 
Method: conceptual 
translation supplied by 
authors 


2<K 

i 


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


915 


AE000984 


Archaeoglobu 
s fulgidus 


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


913 


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


163 


30 


919 


L12016 


Caenorhabdit 
is elegans 


putative 


123: 


41 


920


AF102177


Homo sapiens 


tumor antigen SI#P-8p 


1260 


97 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 


922 


AL161495 


Arabidopsis 
thaliana


putative WD-repeat protein 


866 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001


Caenorhabdit 
is elegans 


similar to 

Schizosaccharomyces pombe 


605 


SI 


925 


X71976 


Mus musculus


Fif 


1503 


95 


926 


K92288 


Drosophila
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y27.499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_l. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase 


912 


100 


931 


U28991 


Caenorhabdit 


coded for by C. elegans cDNA 


660 


55 






is elegans


cm2lc'. 


t 


932 


AL08006b 


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01384 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5S6S. 


767


96 


934 


A027646S 


Homo sapiens 


integral membrane transporter 
protein 


1200 


300 


935 


AL035661 


Homo sapiens 


dJ7S6G23.2 (novel protein
similar to drosophila
transcriptional repressor)


1342


80 


936 


AB026808 


Mus musculus


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRI HFB221 » 


260a 


99


93B 


X65724 


Homo sapiens 


orf: 


496 


100


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156. 


1487 


100


940 


G04047


Homo sapiens 


Human secreted protein, SEQ
ID NO: 612L . 


137 


100 


941 


AF094S83 


Homo sapiens 


putative HIV-1 infection 
related protein 


45* 


100 


942 


AC02420G 


Caenorhabdit 
is elegans


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 

domain i 


35C 


69 


943 


AF329756 


Homo sapiens 


G5c 


273 


100 


944 


K23765 


Rattus 
norvegicus 


alpha-tropomyosin


13 j 


96 


945 


AC009917 


Arabidopsis 
thaliana 




56> 


47 


946 


AF22346B 


Homo sapiens 


A3021 protein 


551 


44 


947 


AF055473 


Homo sapiens 


GAGE - I 


271- 


51 


948


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


AF1439S6 


Mus musculus


cor cn in** 


230C 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


1861 


99 


951 


W4 504J 


Homo sapiens 


Human low density lipoprotein
binding protein LBP-2.


201 


67 

1 


952 


AB016B6I 


Arabidopsis 
thaliana 


gene id:MXCl7.7~ 


203 


46 


953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-
1999 3 2 -AUG- 1998 Human NCE-2
protein.


36L- 


100 


954 


AF145615 


Drosophila 
melanogaster 


BcDNA.GH03377


62r 


46 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF133 


2463 


99 


956 


U09410 


Homo sapiens 


zinc finger protein ZNF133


3853 


99 


957


AF195623 


Homo sapiens 


cholinephosphotransferase 3
alpha


2126 


99 


956 


X94917 


Drosophila 
melanogaster 


head-elevated expression in
0.9 kb


15: 


32 


959 


U54807 


Rattus 
norvegicus 


GTP-binding protein


1167 


97 


96C 


AF058807 


Bos taurus 


GTP-binding protein rah


60f 


97 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 732b.


4 7.". 


100 


962 


AF078850


Homo sapiens 


steroid dehydrogenase homolog


5e?


40 


963 


AP0017S4 


Homo sapiens 


transient receptor potential-
related channel 7, a novel 
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


dJH00K13.I (putative novel
protein) 


1129 


100 


965 


X61383 


Rattus 
rattus 


interferon-induced protein


20; 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme 


3276 


100 


967 


AL031432 


Homo
sapiens 


dJ4 6SN24.2.1 (PUTATIVE novel 
protein) (isoform 1)


£91- 


100 


96fc 


U79275 


Homo sapiens 


unknown


611 


100 


9€> 


AJG11306 


Homo
sapiens


guanine nucleotide exchange 
factor (long isoform)


275S 


9S- 


570 


AF261134 


Homo sapiens


exosome component Rrp4fc 


1186 


100 


97; 


U53336 


Caenorhabdit 
is elegane 


weak similarity over a short 
region to myosin heavy chain


536 


23 


972 


AC01874S 


Leishmania 
major 


L8840.12 


589


53 


S73 


AF188504


Mus musculus 


I.NV 


544


81 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852 


9E 1 


Bli 


AF049523


Homo sapiens 


huntingtin-interacting
protein HYPA/FBP11 


1390 


97 


97( 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977


G04020


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8101. 


626 


100 


1 97fc 


AF164797 


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


97S 


U94991 


Xenopus
laevis 


transcription factor XLM03 


795 


97 


S80 


S7377S 


Homo sapiens 


calmitine; calsequestrine


2029 


100 


981 


Y94888 


Homo
sapiens 


Human protein clone HP01461.


2501 


100 


962 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


981: 


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85


98 < 


AJ249207 


Rhodococcus 
sp. AD45


putative racemase 


351 


43 


56!: 


Z30093 


Homo sapiens 


basic transcription factor 2,
35 kD subunit 


1576 


99 


986 


AB030835 


Homo sapiens 


contains two glutamine rich
domains, three zinc-finger
domains, and matrin l- 
homologous domain 3 (NH3) 


4697 


99 


98' 


AF22725B 


Bos taurus 


RPGR-interacting protein-1


1262 


36 


9Bc 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE) 


4048 


95 


989 


AL022238


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161426 


Homo sapiens 


HSPC30E 


448 


92


99j 


AF161426 


Homo sapiens 


HSPC306 


448 


92


99^ 


AF161426 


Homo sapiens 


HSPC308 


453 


92 


99 :- 


AL023859 


Schizosaccha
romyces
pombe


tRNA-splicing endonuclease
subunit 


172 


42 


3S<: 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox 
domain protein) 


241 


47 


95b 


AC005253


Homo sapiens 


R26445 1 


902 


100 


996 


AF265206 


Homo sapiens


MOG1 isoform A


974 


100 


997 


AJ248285 


Pyrococcus
abyssi


sarcosine oxidase, subunit
beta (soxB) 


195 


28 


99b 


AE003641 


Drosophila 
melanogaster 


BG:DS00941.3 gene product 


218 


5B 


999 


W69343 


Homo
sapiens 


Secreted protein of clone 
CR930 1. 


1340 


99 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase Tl mRNA with 
GenBank Accession Number
M24102.1 


1543 


100


iooa 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


166B 


100 


1002 


AF208844 


Homo sapiens


BM-002


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens


dJ4 62023.2 (novel protein)


2058 


100 


1005 


S45367 


Canis
familiaris 


centractin 


1949 


100 


lOOi 


S45367 


Canis
familiaris


centractin


1313: 


96 


1007


AB022158


Mus
musculus


chaperonin containing TCP-1
epsilon subunit 


2649 


96 


10CE j Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 3fc.


1282 


97 


100S- IAB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1011


Z68218


Caenorhabdit 
is elegans


K0jH12.1 


265 


67 


101} | AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


i 


1012


Z1400C


Homo sapiens 


ring: 


2017 


100 


1013


G02642


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6922. 


332 


23


1014 


AF145659 


Drosophila 
melanogaster 


BcDNA.GH10333


1244


52


1015 


Y02860 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 6b. 


664 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


IT A 


97 


1017 


Y99448 


Homo sapiens 


Human PR01759 (UNQ832) amino 
acid sequence SEQ ID NO:37<;- 


2323 


100 


1016 


X6725C 


Rattus
norvegicus 


n-chimaerin


l / it 


y * i 

: 


1019 


AF183417 


Homo
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a 


674 


100 


1021 


AF190625 


Coturnix
coturnix


qdgl-1 


DJO 


96 


1022 


AL133363 


Arabidopsis 
thaliana


putative protein 


155 


37


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence


2483


100


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol 
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge


2243 


100


1025 


X69910 


Homo sapiens 


P63 protein 


2956 


95 


1026 


U80736 


Homo sapiens 


CAGFf 


1657 


100 


1027 


AB029333 


Halocynthia 
roretzi 


HrPET-1


1046 


54 


1026


AB032931


Homo sapiens 


ubiquitin-conjugating enzyme
isolog


1045 


100 


1025 


G01797 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 5878. 


749 


96 


1030 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878. 


749 


96 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032


A0222968


Mus musculus 


L-periaxin 


120 


30


1033


Z81317


Schizosaccha
romyces
pombe


DNA2-NAM7 helicase family
protein 


685 


" i 


1034 


Y41515 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 75.


1321 


99 


1035


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabdit 
is elegans 


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196


43 


1038


W74580


Homo 
sapiens 


Human membrane protein 
BA03O6 . 


1921 


97 


1039 


U88173 


Caenorhabdit 
is elegans 


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8


331 


80 


1040


AF29C204 ' 


Homo sapiens 


blood group carrier molecule
D0K3 


1637 j 95- 


1043 


Y9673C 


Homo
sapiens


PRO539, a Costal-2 homologue.


162 ) Tc 


2042 


AF140683 


Mus musculus 


F-box protein FWD^ 


2397 


96 


1043 


AF151023 


Homo sapiens 


HSPC189


1104 


100


1044 


AF181631 


Drosophila
melanogaster


BcDNA.GH04.92 5 


204 


3*/ 


1045 


Y77S85 


Homo sapiens 


Human collectin amino acid 
sequence.


194C 


100 


1046 


A0243S72 


Homo sapiens 


6-phosphogluconolactonase


1317 


100 


1047 


A3035863 


Homo sapiens


ATP specific succinyl CoA
synthetase beta subunit 
precursor 


2324 


9S 


1046 


AL034550 


Homo sapiens 


doll84F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


104° 


AF163825 


Homo sapiens


pre-B lymphocyte protein 3 


634 


100 


105C 


AF201949 


Homo sapiens


60S ribosomal protein L30
isolog


86E 


100 


1051 


AF190624 


Mus musculus 


mcgl-1 


236 j ei 


1052 




Drosophila
melanogaster


CG6151 gene product 


160 


44 


1053 


G01193 




Human secreted protein, SEQ
ID NO: 5272. 


646 


96 


1054 


AL162756 


Neisseria
meningitidis


Glu-tRNA(Gln)
amidotransferase subunit A


681 


44 


1055 


AF131856 


3a t tut 


tRNA selenocysteine
associated protein


1525 


99 


1056 


U89649 


Chlamydomonas
reinhardtii


Mr19,000 outer arm dynein


244 


34 


1057 


AF159142 


Homo sapiens 


breast cancer metastasis 
suppressor 1


663 


\ 5-- 


1056 




"Home ' 

sapiens


protein pemphaxin 


171C 


99 


1059 


AJ270952 


Homo sapiens


putative membrane protein 


1363 


100 


1050 




Heterodontus
francisci


HoxDE 


742 


83 


1061 


X63417 


Homo sapiens


IRLE 


1037 


100 


1062 




coelicolor 
A3 (2) 




143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10). 


2547 


100 


1064 


AF263614 


Homo sapiens 


acetyl-CoA synthetase


3493 


99


1065 


Y13356 


Homo sapiens 


Amino acid sequence of
protein PRO221.


1363 


100


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2984252)


662 


98 


1067 


Y18930 


Sulfolobus
solfataricus 


hypothetical protein 


162 


29 


1068 


R65969 


Homo

sapiens T98G 


Glioblastoma-derived
polypeptide.


887 


100 


1069 


Y07964 


Homo sapiens


Human secreted protein
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072 


US2794 


Mus musculus 


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens


Amino acid sequence of 


1271 


91 
WO 03/53312



PCT/US00/34263






protein PR0326. 






1076 


AF1614 5? 




HSPC339 


571 


100 


107? 


Y79509 


Homo sapiens


Human carbohydrate-associated
protein CRRAP- fc


2151 


96 


1078 


AF223466 


Homo sapiens 


HT015 protein


831 


6t 


1079 




thaliana




o flf; 

COB 


2 9 


1080 


AB024937 


Homo sapiens


LUNX 


1284 


100


1081 


YI4768 


Homo sapiens


V-ATPase G-subunit like 


579 


100 


1082


AF016416 


Caenorhabdit 
is elegans


F29A7.4 gene product


141 


— 


1083


L13291 


Homo sapiens 


TV Pip _ yi "r»*~iov/1 a vet i r\ ■» Y"\*» Vi\ /HvaI 2 cp 
nuir -t j- Lf\j^>y ± a i. y x 1 i-L i ic jiyuiuiasc 


802 


45


2084 


AB041541 


Mus musculus


unnamed protein product 


1S1 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ 

ID jnu : bUUJ . 


202 


97 


1086 


AtsO J 0814 


Homo sapiens 


H-REV107 protein homolog


833 


100


1087 


AF151636 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 


Y84432 


Homo sapiens


Amino acid sequence of a
human RNA-associated 
protein. 


2783


100 


J.UC3 


Y94867 


Homo
sapiens


Human protein clone H?iu563 . 


613 


100 


1090


AK023982 


Homo sapiens


unnamed protein product 


130 


49


1091 


AB0415B6 


Mus musculus 


unnamed protein product


1103 


81 


1092 


Y71277 


Homo sapiens


Human Zlipc3 protein. 


606 


100 


1093 


U34973


Mus musculus


protein tyrosine phosphatase- 
like 


1131 


9^ 


2094 


Y66677 


Homo
sapiens


Membrane-bound protein
PRO828.


522 


5€ 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53.


1029 


9S 


1096 


Y87276 


Homo sapiens 


Human signal peptide
containing protein HSPP-53 


863 


9fc 


1097 


AF161455 


Homo sapiens 




74<£ 


96 


1098 


U80029 


Caenorhabdit 
is elegans


similar to thioredoxin




^ c 
J 




AJ005866


Homo sapiens 


Sqv-7-like protein


1321 


9S 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118 


go 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102


AJ005866


Homo sapiens 


Sqv-7-like protein 


1016 




1103 


AL110244 


Homo sapiens


hypothetical protein 


299


33 


1104 


AF242194 


Drosophila 
melanogaster


brakeless-E 


147 


52 


lint 


MJLAJJlUJLU 


Homo sapiens 


cij h 6<.rzH . 1 iFUJAiivb novel 
protein similar to C. elegans 


966 


1 


1106 


U28C16 


Mus musculus 


parathion hydrolase 
protein 


1624 


87 


11C7 


AJ27815C 


Homo sapiens 


YYI if'ah'l ira "I ■» T> T ^ If "IY1-1QC. 

pULdLlve lipiU JS. _ ii cl S t 


2207 


99 


1108 


G03733


Homo sapiens


Human secreted protein, SEQ 

TTi MO- 7fil4 

i.U IV*-/. /Oil • 


495 


96 


1109 


AF217287 


Drosophila 
melanogaster 




834 


54 


1110 


Y2B921 


Homo
sapiens 


Human regulatory protein 
HRGP-7. 


QA 1 
jl X 


46


1111 


Y2892: 


Homo
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


52 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9


2027 


99 


1113 


AF182076 


Homo
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ | 475 


96


TABLE 2


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


ID NO: 8120.






1115


AF229439


Mus musculus


zinc finger protein 28b 


169'/ 


S3 


1116 


L40357


Homo sapiens


thyroid receptor interactor


509 


100


1117 


L40357


Homo sapiens


thyroid receptor interactor


40< 


81: 


1118


A1215^ 


Homo sapiens 


Human X5L cDNA 


167? 


100


1119


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


5? 


1120


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent 
Protein Kinase LIKE protein) 


2341 


91 


1121 


Y5790J 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


321 


3t 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 


455


77 


1123 


AF225418 


Homo sapiens 


lipase 


1533 


97 


1124


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIF . 


3227 


100


1125 


AL035690 


Homo sapiens 


dJ202I21.l (novel protein) 


95: 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2 


128f 


99 


1127 


AB030505 


Mus musculus 


UBE-lci 


1069 


79 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein
sequence.


874 


100


1129


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence.


877 


100


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6) . 


1406 


100


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 




1134 


AF180681 


Homo sapiens


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus 


enhancer of polycomb


264 


43 


1136 


M62419 


Mus musculus 


clathrin-associated protein


2189 


99 


1137 


A0006215 


Drosophila 
melanogaster 


clathrin-associated protein 


1254 


7t 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95 . 


440 


98 


1139 


W68104 


Homo
sapiens 


A Rab protein designated
HRAES-2. 


1065 


95 


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339.


3979 


96 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product . 


330S 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of
protein PRO320.


1694 


95 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145 


Y12917 


Homo sapiens


Amino acid sequence of a 
human secreted peptide. 


1096 


100


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370


98


1149 


Y73238 


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492 


100 


1150 


W74641 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55


HEAAK6G . 






1151


AF044201 


Rattut- 


ri^iiTAl TTiPrrViTAnp nrotpln m* 
NMP3S 


157 ( 


9i 


1152 


AF156774 


Homo 


lysophosphatidic acid
acyltransferase-gamma1


16E£ 


99 


1153 


AL118501 


Homo sapiens


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))




64 


1154


AF131852 


MLJWO hdpirillto 


Unknown 


47/ 


100 


1155 


Y4170t 


Homo
sapiens 


Human PRO352 protein
sequence.


136: 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117.


60'. 


9S 


1157


AK112444


Lupinus 
luteus 


L-asparaginase 


287 


43 


1158 


AF15184E 


Homo sapiens 


protein 


23? 


32


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


244! b 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-l 



19t. 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107
SEQ ID NO: 107. 






1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


74e 


83 


1163 


AF113534 


Homo sapiens 


HPI-BP74 protein 


2723 


96


1164 


AF232226 


Danio rerio 


Deddu 


191 


41


1165 


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1052 


71 


1166


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


94e 


76 


1167 


AF187733 


Homo sapiens 


syntaphilin 


ft IT 




1168 


AB019435 


Homo sapiens 


phospholipase


■ 9Si 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 




j j 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6 . 


1 1 Ql 


100 


1171 


L03188 


Saccharomyce 
s cerevisiae 


putative


18C 


22 


1172 


AF113751 


Mus musculus


nuclear pore membrane 
giycoprouexii irKjFixxxj 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


□ u X \)H /. IV i. U . j lUvVcl piUiClu; 


1 2 8 u - 


100 


1175 


U41278 


Caenorhabdit 
is elegans 


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


28* 


83 


1177 


AC012680 


Arabidopsis 
thaliana


putative protein phosphatase 
2C; 55455-56414 


20£ 


37 


1178


G01345


Homo sapiens 

Human secreted protein, SEQ
ID NO: 5426.


6 9? 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180


AF039716 


Caenorhabdit
is elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 
sapiens) 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT- 1994 
Human TCL-1 
polypeptide . 


T cell leukemia/lymphoma 1 


61; 


100






[Homo 
Sapiens 








1183 


U4284: 


Caenorhabdit 
is elegans 


short region of weak
similarity to collagen


161


33 


1185


A0131613 


Homo sapiens 


dicarboxylate carrier protein 


147C 


95 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y0273e 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03 . 


636 


100 


1188


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


Xe9602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32328 


Haemophilus
influenzae
Rd


ribosomal protein S6
modification protein (rimK)


266 


31 


1192 


AF154831 


Rattus
norvegicus 


PV-1


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


916 


100 


1194 


AF026530 


Rattus 
norvegicus 


b \- a. t JlinXil 11AC JJitJl_C-I.il £>pj.J.l'C 
var i anl' T3R^' ' 

VCH.J.CIIIL I\DJ 


1093 


97 


1195


U35244 


Rattus
norvegicus 


vacuolar protein sorting 

* * *— 'l I \\J A uy L ~ v Wo v 


2983 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 


1680 


100 


1197


AF157318 


Homo sapiens 


r\U Vi / y> *- \J C C ■*■ 1 i 


912 


47 


1198


AF125443 


Caenorhabdit 
is elegans 


contains similarity to S.
pombe phosphatidyl synthase
(GB:Z28295)


460 


35 


1199 


AF201934 


Homo sapiens 


DC12


1645 


86 


1200 


AL03177S 


Homo sapiens 


OJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100


1201 


M21103 


Ovis aries 


BIJ.1B4 high- sulfur keratin 


484 


82 


1202 


285986 


Homo sapiens 


dJl08K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


i 


1203 


U187G2 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


i erkv 


2235 


76 


1205


AB002327


Homo sapiens 


KIAA0329 


151 


24 


1206


AB019233 


Arabidopsis
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


742 


100 


1208


AF207989 


Homo sapiens 


orphan G-protein coupled
receptor


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
GJ)) 


181 


44 


1210 


U21S49 


Mus musculus 


Ac39/physophilin


1280 


66 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein


945 


66 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin E 


222 


35 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1


1950 


77 


1215 


G03022


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72S10 


Caenorhabdit 


similarity to yeast UTR3 


634 


49


PCT/US00/34263






is elegans 


protein {Swiss Prot accession 
yk677hll.5 comes from that 
gene 






1217 


243703 


Saccharomyce
s cerevisiae 


unknown


134 


22


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.1E 


199 


29 


1219


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


£70750 


Caenorhabdit
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from
this gene 


965 


SB 


1221


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


1222


AF155100 


Homo sapiens


zinc finger protein NY-REN-21
antigen 


2261 


100 


1223 


J05072


Bos taurus 


GTP-binding regulatory
protein gamma-6 subunit


356 


100


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence . 


1169 


99 


1225 


AL050170


Homo sapiens 


hypothetical protein 


714 


100


1226


X64002 


Homo sapiens 


RAP74 


2661 


99 


1227


XO4O05 


Homo sapiens 


catalase 


2846 


100 


1228


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229


AF045564 


Rattus
norvegicus 


development-related protein


1715 


93 


1230


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231


L08239


Homo sapiens 


located at OATL1


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabdit 
is elegans 


contains similarity to
TR:004595 


744 


31 


1235 


AC006634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101 


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 4510.


324 


100 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


496 


26 


1243 


X76483 


Gallus
gallus


Yes-associated protein
(65kDa)


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein)


856 


100 


1246 


AJ276003


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910


Homo sapiens 


Human transmembrane protein 
HTMPN-34. 


1365 


98 


1248 


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein j 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG606 


1350 


88 


1251


M24852


Rattus 
norvegicus 


neuron-specific protein PEP-
19


124 


46


1252 


AF146738 


Rattus
norvegicus 


testis specific protein. 


771 


SI- 


1253 


G02728


Homo sapiens


Human secreted protein, SEQ
ID NO: 6809.


419


97


1254 


VJ44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 | 1 


1255 


AC006536 


Homo sapiens 


BC41195 1 


831 


It 


1256 


AB004316 


Bos taurus


mitochondrial methionyl-tRNA
transformylase 


1556 


86 


1257 


Z35094 


Homo sapiens 


SURF-2


1354 


S'; 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of
protein PRO214.


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming
protein; similar to P14373
(PID:g132517)


1299 


100 


1260 


AC005099


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp.


gamma-glutamyl transpeptidase 
(AA 1-568) 


697 


3: 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94


1264 


AF178983 


Homo sapiens


Ras-associated protein Rap1


433 




1265 

j 


Y70473 


Homo sapiens 


Human cyclic nucleotide- 
associated protein-1 (CNAP-
1).


2785 


95 


1266 

! 


Y41738 


Homo
sapiens


Human PRO541 protein
sequence.


1622 


100


1267 


AF061346 


Mus musculus


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabdit 
is elegans 


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab3 > 


942 


9£ 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


9f 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 


5b 


1272 


AF201933 


Homo sapiens 


DC11 


650 


lOf 


1273 


AF201933 


Homo sapiens 


DC11 


346 


9fc 


1274 


AL021710


Arabidopsis 
thaliana


putative protein 


346 


49


1275 


AC004449 


Homo sapiens 


R33683_3


556 


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AG87, SEQ ID NO: 210. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9) . 


1576 


99 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


476 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100


1280 


AF161380 


Homo sapiens 


HSPC262 


772 


100


1281 


Y48610 


Homo sapiens


Human breast tumour-
associated protein 71. 


779 


100 


1282 


AC015446 


Arabidopsis
thaliana


Similar to A1G1 protein 


406 


3b 


1283 


AK024432 


Homo sapiens


FLJ00022 protein 


403 


3£ 


1284 


W96153 


Homo sapiens 


Human FADD-interacting
protein (FIP) . 


1825 


81 


1285 


AJ001019 


Homo sapiens 


ring finger protein 


1301 


100


1286 


AE0G3823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF178632


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033 


Homo
sapiens 


similar to MLN 64; similar to 

138027 (PID:g2135214)


1195 


100 


1289 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54


1291


273424 


Caenorhabdit
is elegans


C44B9.1 


235


3C 


1292 


Y9487: 


Homo
sapiens


Human protein clone KP02551. 


1222 


100


1293 


AF13042S 


Homo sapiens 


retinoblastoma-associated
protein RAP140


489 


25 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7937 . 


538 


99 


1295 


AF133670 


Mus musculus


ARL-6 interacting protein-2


367 


51 1 


1296 


AJ249735


Homo sapiens 


claudin-6 


1142 


100 


1297 


X5756C 


Escherichia 
coli


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens


LIM and cysteine-rich domains 
protein 1


1997 


100


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8.5 


324 


25 


1300 


AB024523 


Homo sapiens


basic kruppel like factor


1206 


100


1301 


X5S989 




eosinophil cationic-related
protein


737 


9S 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia
coli


nrv^n rpartino t t*amp (AA 1— 


359 


100


1304 


U19577 


Escherichia
coli 




242 


93 


1305 


riT toDDU o 


Mus musculus


iMCiJjr protein 


1409 


97 


1306




Homo sapiens


Human transmembrane protein 


932 


100


1307 


U58750 


Caenorhabdit 


similar to the mitochondrial 

f Ami 1v 
CcUTi-lCtT i. C1IMJL s y 


365 


54 


1308


AF044774 


Homo sapiens 


breakpoint cluster region 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ21CBl.l (KIAA0680) 


267 


34 


1310


X62693 


Homo sapiens


Lie ctntigcii 


620 


96 


1311 




is elegans 




283 


35 


1312 


AFT 11 ? 1 




t-ixr^iuuoUliie id \j±}i£ii xcauiiiy 

frame 5 


1493 


100 


1313 


Y41763 


sapiens 


Human DPf~)Q ft nrnt" p» l *■ 

sequence . 


1636 


100


1314


API 96 979 




•TM9/1 nrnt pin 
\iv\Ci. pJTOtCI.Il 


2239 


100 


1315 


AF053356 


Homo sapiens 


insulin receptor substrate 


228 


97 


1316 


Y66695 


Homo 
sapiens


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127 


Gallus
gallus 


SAPK interacting protein


2442 


89 


1318 


AF153127 


Gallus
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus
gallus


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens


23 kD highly basic protein 


1044 


100


1321 


AF174605 


Home 

saplensj 

>Y8308f 

YB30B6 

MAR-2000 28- 

AUG- 1998 F- 

box protein 

FBP-16. 

fKomc 

sapienr 


F-box protein Fbx25 


467 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


2< 


1323 


Y17013 


porcine 
endogenous 


pol 


304 


64






retrovirus 








1324 


AL138655 


Arabidopsis 
thaliana 


putative protein


1174 


1 •' 


1325


AL138655 


Arabidopsis 
thaliana 


putative protein


946 




1326 


AL133215 

1 


Homo sapiens


bA108L7.2 (novel protein
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541


Homo sapiens


HSPC056 


1357 


01 


1328


Y73346


Homo sapiens


HTRM clone 6L96S9 protein 
sequence. 


785 


9t 


1329 


L10910


Homo sapiens


splicing factor


912 


e; 


1330


AF146568


Homo sapiens


MIL1 protein 


1936 


100


1331 


K87772


Homo sapiens


Human serum glucocorticoid-
regulated kinase (U-SGK2) 
polypeptide . 


232 


3S 


1332


Y41741


Homo 
sapiens 


Human PRO704 protein
sequence.


1860 


100


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


9; 


1334 


Z82273 


Caenorhabdit 
is elegans 


Similarity to Mouse kinesin-
like protein KIF4 comes from
this gene 


576 


44 


1335


AE000810 


Methanobacterium
thermoautotrophicum


conserved protein


290 




1336


Y68779


Homo sapiens


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11.


1019 


93 


1337 


AB027003


Mus musculus


protein phosphatase 


378 


84 


1338


U64856


Caenorhabdit
is elegans


weak similarity to TPR
domains 


215 


40 


1339


AE001394


Plasmodium
falciparum


protein of the YMR7 family 


170 


29 


1340 


X76727


Homo sapiens


MT-11 protein


204 


89 


1341 


AC011914


Arabidopsis
thaliana


putative mutT protein; 68398-
67881 


289 


4b 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344 


AC006963


Homo sapiens


similar to Kelch proteins;
similar to BAA77027 
(PID:g4650844) 


894 


3S 


1345 


AF257466


Homo sapiens


N-acetylneuraminic acid 
phosphate synthase


1880 


99 


1346 


Y25896


Homo sapiens


Human secreted protein 
fragment encoded from gene 
64. 


1146 


100 


1347 


AJ272073


Torpedo
marmorata


male sterility protein 2-like
protein 


1664 


5t 


1348


AF161548


Homo sapiens


HSPC062


1018 


9£ 


1349 


W78128 


Homo sapiens


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117


100 


13S3 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP- 14 


613 


100 


1354 


AC005328


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876


Caenorhabdit 
is elegans 


contains similarity to 
SW:RPB1_CRIGR


829 


t> u 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234


3869 


100 


1361 


AL163279


Homo sapiens


homolog to cAMP response 


5035 


99


element binding and beta
transducin family proteins 






1362 


Z48475


Homo sapiens 


glucokinase regulator 


ii en 


95 


1363


Z48475


Homo sapiens 


glucokinase regulator 


2682 




1364


AF195764


Homo sapiens 


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1
protein 


20S5 


99 


1365 


AF116605


Homo sapiens 


PRO0915


581 


100 


1366 


AF116605


Homo sapiens 


PRO0915 


~c~q~ 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein
similar to C. elegans 
T19B10.6 (Tr:Q22557)) 


2581 


95 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel
K4Hnovl5.


1342


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


4S 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA - 
25 to 1018) (3416 is 2nd base
in codon) 


5908 


99 


1372 


25804t 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


95 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445 


Homo sapiens 


DOC1 


1645 


4€ 


1376 


AL117337 


Homo
sapiens 


bA393J16.1 (zinc finger 
protein 33a (KOX 31))


2S0 


60 


1377 


AC005328


Homo sapiens 


R26660_1, partial CDS


1126 


100


1378


U35113 


Homo sapiens 


metastasis-associated gene 


1821- 


69 


1379


LI 53 13 


Caenorhabdit 
is elegans 


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


150£ 


100


1381 


AB037360


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


91 


1383 


AF237676 


Mus musculus


G beta-like protein GBL 


1723 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL 


104? 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaRKG-1 . 


715 


100 


1386


AF212162 


Homo sapiens 


ninein 


10369 


99 


1387 


AL031685


Homo sapiens 


dJ9G3K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380
>W06316 W06316 03-OCT-1996 
2 7 -APR- 1995 TRP-1 protein. 


542 


86 


1389


Ar X o / 3or 


Homo sapiens 


zinc finger protein ZNF223 


266£ 


99 


1390


AvUJD X DU 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100


1391 


AF287894 


Homo sapiens 


PIST 


1410 


97 


1392


Ar / b 2. i, o _ 


Homo sapiens 


inner centromere protein 
INCENP 


179^ 


99 


1393 




Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


95 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


95 




G02224 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 6305. 


299 






AC004809 


Arabidopsis 
thaliana 


Similar to


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


6b 


1399 


AL133396 


Homo
sapiens


dJ1068H6.4 (prion protein
like protein doppel)


OCT 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


P1.11659_5 


280 


54 


1402 


X91489 


Saccharomyce 
s cerevisiae 


putative HMG box 


164 


27




1403


. 


Homo


•nLrnan transferase irtwjsro- ih . 




100 






sapiens








1 

1404


X8 1 05 I 


Mus musculus


_ — 

tex26 1 


— — 

1010 




i a nt 




Mus musculus


J 1 v- 


194


29


1406


AB030-: b x 


Homo sapiens


GTPase activating protein


*i *> i - 


99 


1407


tv -t m ncot 
AJUlUbeb 


Rattus


PTB-like protein 




Q Q 






rattus


— — __ 






1408


X7576C 


Drosophila 


LRR4 > 


364 


29 






melanogaster








1409


U7661 f 


Mus musculus




N - RAJ 


O Us 


4 8 


1410


AC005E76 


Homo sapiens 


F2 0 887 1, partial CDS 


83 b 


63 


1411


AE000284


Escherichia 


orf, hypothetical protein


3 6 C 


100 






coli








1412


X0156I- 


Escherichia 


Lb trpiE) laa l-i/y) 


91 j 


100 






coli








1413 


W78271 


Homo sapiens


Fragment of human secreted 


1264 


99 








protein encoded by gene 33 . 






1414


AB031051 


Homo sapiens 


organic anion transporter


383^ 


100 








OATP-L 






1415


M1746( 


Homo sapiens 


coagulation factor XII 


3455


100


1416 


AF097S94 


Homo 


L-kynurenine/alpha- 


220: 


99 






sapiens 


aminoadipate aminotransferase 






1417 


AF251077 


Homo sapiens 


HSPC24:- 


126; 


99 


1418


Y0994L 


Rattus


putative integral membrane 


1096 


61 






norvegicus


transport protein 






1419


U13152 


Mesocricetus 


guanine nucleotide-binding 


2175 


76 






auratus


protein beta 5 






1420


AL162456 


Homo sapiens 


bA46SL10.5 (KIAA1176 (novel 


S696 


100 








protein, presumed ortholog 












of mouse K-Cl cotransporter












KCC2))






1421 


Y9942< 


Homo sapiens 


Human PRO1604 (UNQ785) amino


152 


29 








acid sequence SEQ ID NO: 308. 






1422


Y9492I- 


Homo sapiens 


Human secreted protein clone 


403S 


99 








qsi4_3 protein sequence SEQ 












ID NO: 52 . 






1423 


AF177386 


Homo 


cancer-amplified


1074 t 


99 






sapiens 


transcriptional coactivator 












ASC-2






1424 


Y4 851'/ 


Homo sapiens 


Human breast tumour- 


1851 


99 








associated protein 62 . 






1425


AF20884 E 


Homo sapiens 


J3M- 00C 


1454




1426 


AF208846


Homo sapiens 


BK- 006 


853 




1427


AF112886 


Bos taurus


differentiation enhancing 


4693


nr 
33 








factor 1






1428


D4 1387 


Homo sapiens 


Gu protein 


1372 


ci 

Dj 


1429


AF161534 


Homo sapiens 


HSFC04ir 


2853 




1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 




1431 


Y6671B 


Homo 


Membrane-bound protein


1886 


100 






sapiens 


PRO1106 . 






1432 


AF193613 


Homo sapiens 


cell recognition molecule 


568


inn 








Caspr2 






1433 


AB044560 


Mus musculus


Gliacolin


192 


1 A 


1434 


R99900 


Homo sapiens 


NTU-1 nerve protein, 


707 


rn 

Ol 








facilitates regeneration of 












nerve cells. 






1435 


Ar 22U5J 0 


Homo sapiens 


myo-inositol 1-phosphate




100 








synthase A3 






1436 


X70944 


Homo sapiens 


PTB-associated splicing


1261 


72 








factor 






1437 


AF271732 


Homo sapiens 


hririaina Intporarnr-I 


1282 


100 


1438


Y30813 


Homo sapiens 


Human secreted protein 


595 


98








encoded from gene 1 . 






1439 


AJ293659 


Homo sapiens 


mucolipidin 


626 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform


3346 


100


1442


AB03S669 


Homo sapiens 1 AL>EX_- 


1944 


100 


1443


AF2377H 


Drosophila
melanogaster 


Diablo


as; 


27 


1444 


AJ012096


Homo sapiens


Kefl beta protein 


435 


35 


1445


X73874 


Homo sapiens 


phosphorylase kinase


6233


98 


1446


AF214114 


Homo sapiens 


breast carcinoma-associated
antigen BCAA


3995 


95 


1447


AF003924


Homo sapiens 


ANC 2 HOI 


2645 


9S 


1448


AF003136 


Caenorhabdit
is elegans


contains weak similarity to
an AMP-binding motif


284:- 


52 


1449


AF155112 




NY-REN-50 antigen


1184 


85 


1450


Y95004 




vc54_l, SEQ ID NO: 46. 


S85 


100


1451


AF107203


Homo sapiens 


ataxin 2-binding protein


688 


57


1452


AF107203


Homo sapiens 


ataxin 2-binding protein


456


78 


1453 


Z38013 


Mus musculus


L'nK - NST 


882 




1454 


X90566 


Homo sapiens 


Protein sequence and
annotation available soon via
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455


AL.03S4 09 


Homo sapiens 


60564M11.3 (similar to
sialyltransferase)


135fc 


100 


1456


D4448C 


Mus musculus


NATH-2 protein 


272 


100 


145f 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


476 


45 


1455 


AF242552 


Gallus 
gallus 


retinovin 


945} 


34 


146C 


U11036 


Homo sapiens 


Ifcdl 


724 


84 


1463 


AB02S258 


Mus musculus 


granuphilin-a 


54 5 


39 


146; 


Y08134 


Homo sapiens 


acid sphingomyelinase- like 
phosphodiesterase 


2426 


99 


14 63 


AC004997 


Homo sapiens 


match to ESTs Z43975
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004957 


Homo sapiens 


match to ESTs Z43975
(NID:g573097), R19699
(NID:g774333)


865 


98 


1465. 


U32743 


Haemophilus 
influenzae
Rd 


fucose operon protein (fucU) 


315 


50 


146* i Y09C22 


Homo sapiens 


Nct56-like protein 


2341 


100 


146-/ | AC003034 


Homo sapiens 


Homolog of rat kidney - 
specific (KS) gene 


I07i 


95 


1466 } AF071544 


Spinacia 
oleracea 
1 


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469 


; Y57930 


Homo sapiens 


Human transmembrane protein 


1053 


100 


147C 


AF032666 


Rattus 


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


452 


74 


147; 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens




4026 


98 


1474 


S45936 


Homo sapiens 


HTSI 


1101 


50 


147b 


Y86241 


Homo sapiens 


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 

rubripes 


Sand 


1276 


68 


1477 


U42831 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR-.S221S7) 


846 


44 


1476 


X62447 


Homo sapiens 


PR 264 


543 


61 


1475 


X8220S 


Homo sapiens 




"7116 


100 


1480 


U10S36 


Pan paniscus 


MHC class I A


675 


84 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH - 
WATERMAN 
SCORE 


IDENTITY 


1482 


AL078599 



Homo sapiens 


dJ9SlC6.1 (novel protein 
similar to C. elegans
F55A12.9 (Tr:P91066))


1274 


6r 


1482 


Z56577 


Schizosaccha
romyces
pombe


putative vacuolar protein


256 


25 


1483 


AB005662 


Mus musculus


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens


hypothetical protein


716 


100 


148S 


K27876 


Homo sapiens 


DNA binding protein


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a
partial protein kinase. 


575 


9S 


1487 


X84156 


Saccharomyce 
s cerevisiae 


ATH1 


341 


2S 


1488 


AF036963 


Homo sapiens


RNA helicase


446 


34 


2489 



U56966 


Caenorhabdit 
is elegans


coded for by C. elegans cDNA
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000999 


Archaeoglobu
s fulgidus


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


Rattus
norvegicus


adenylyl cyclase type IV 


7 07 


95 


1492


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence . 


3 513 


99 


1493


Y17220 


Homo sapiens


Human secreted protein (clone 
fj283-11).


4G2 


37 


1494 


AF133670 


Mus musculus


ARL-6 interacting protein-2 


701 


97 


1495 


Y94697 


Homo
sapiens


Human protein clone HP10574 . 


1371 


100 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


AF037447


Homo sapiens 


ribosomal S6 protein kinase


2427 


100


1498


AL445067 


Thermoplasma
acidophilum


putative target YPL207w of 
the HAP 2 transcriptional 
complex related protein 


269 


35 


1499


AB039947 


Homo sapiens


XllL-binding protein 53 


227 


36 


1500 


AJ277750 


Homo sapiens 


UBASH3A protein 


3 509 


100 


1501 


AL050333 


Homo
sapiens


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 


AF17969S 


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504 



Y530D5 


Homo sapiens


Human secreted protein clone 
pn74 9_8 protein sequence SEQ 
ID NO:16 . 


1442 


99 


1505 


X82494 


Homo sapiens


fibulin-2 


3580 


99 


1506 


X98296 


Homo sapiens


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens


dJ1103G7.6 (novel protein) 


1098 


100 


1508 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


AF220182 


Homo sapiens


uncharacterized hypothalamus 
protein HT008 


1181 


96 


1510 


U64601 


Caenorhabdit 
is elegans


Gene probably begins in the 
next cosmid


415 


56 


1511 


AL356192 


Neurospora
crassa 


related to MDM1 protein 


196 


29 


1512 


D17629 


Homo 
sapiens


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513 


AF168717 


Homo sapiens 


x 009 protein 






1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus
norvegicus 


syntaxin 17 


1374 


90 


1517


AF003140 


Caenorhabdit 
is elegans


C44E4.5 gene product 


274 


3i 


1518 


AB002584 


Rattus 
norvegicus


beta-alanine-pyruvate
aminotransferase 


2238 


62 


1519


AL121764 


Schizosaccha 


yeast atpl2 protein precursor 


270 


30 
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TABLE 2 



SEQ 
ID 
NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 






romyces
pombe 


homolog






1520 


AF255910 


Homo
sapiens 


vascular endothelial 
junction-associated molecule 


547 


20( 


2S22 


D31764 


Homo sapiens 


KIAA0064 


170 


2 < 


1522: 


Y66634 


Homo
sapiens 


Membrane- bound protein 
PRO290.


985 


3 00 


1523 


Y9445D 


Homo sapiens 


Human inflammation associated
protein 


25C 


41- 


1524 


AC000107 


Arabidopsis
thaliana 


F17F8.22 


277 


3', 


1525 


AF109377 


Mus musculus 


IdlEp 


1277 


8? 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


95 


1527 


Y06135 


Mus musculus 


acid sphingomyelinase- like 
phosphodiesterase 


1496 


75 


1528


AK024423 


Homo sapiens 


FLJ00012 protein


611 


1G0 


1529


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein 


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


5( 


1532


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
H0EAS24 . 


493 


5'/ 


1533


AF039023


Homo sapiens


Ran-GTP binding protein; 
RanBP6 


5707 


95 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.S 


374 


37 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100


1536 


Y36278 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3593 


9S 


1536 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


2532 


AF266756 


Homo sapiens 


sphingosine kinase


2011 


9£ 


1540 


248604 


Homo sapiens 


OA2 


2238 


100


1541


AF00Q195 


Caenorhabdit
is elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


42 


2542 


Y71259 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin. 


9415 


9r 


2543 


X76092 


Homo sapiens 


DNA binding protein HFX3 


3327 


100


2544 


AB025330


Homo sapiens 


HRIHFB2 007 


631 


50 


2545 


AF298487 


Homo sapiens 


transcription factor LDP-lb 


2822 


100 


1546 


AF026417 


Caenorhabdit
is elegans 


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens 


KDEL receptor 


1106 


100


2548 


AB035495 


Carassius 
auratus


ubiquitin- activating enzyme 
El 


836 


42 


2549 


AL021707 


Homo sapiens 


dJ50BI15.4 (KIAA0668) 


3688 


100


1550 



AJ223978


Bacillus
subtilis


YvqK protein 


292 


42 


1552 


AF145615 


Drosophila 
melanogaster


BcDNA.6H03377 


822 


44 


2552 


AL157734 


Schizosaccha

romyces 

pombe 


putative mannosyl transferase 
involved in N-glycosylation 


435 


3*? 


1553 


AF079S27 


Mus musculus 


IER5 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


86 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 


AF116553


Drosophila 
melanogaster 


antennal-specific short-chain
dehydrogenase/reductase


277


32 


1557 


Y72056 


Homo sapiens 


Human membrane transport


1975


99 
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TABLE 2 



PCT/US00/34263



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH - 
WATERMAN 

SCORE


IDENTITY 








protein, MTRP-2. 






1556 


Y7105C 


Homo sapiens


Human membrane transport 
protein, MTRP-1.


1975 


9 *r 


1559 


I710S6 


Homo sapiens


Human membrane transport 
protein, MTRP-1.


1894 


9" 


1560 


AF092050 


Mus musculus


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561 


AL109627 


Homo sapiens 


d03 0SK20.2 (acrosomal protein 
ACR55 (similar to rat sperm 
antigen 4 (SPAG4)))


1607 


57 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 


AC002400


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_1 


919 


82 


1566 


AF00019E 


Caenorhabdit
is elegans 


Contains similarity to Pfam 
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD- repeats protein 
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473 


Mus musculus


truncated form of Soxl7 


1047 


It 


1569 


AK025270


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


238b 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product 


180 


31 


1573 


AF0746C3 


Streptomyces 
griseus 
subsp . 
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabdit 
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens


transcription factor ICBP90 


287 


66 


1576 


X64878


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578 


G00975 


Homo sapiens


Human secreted protein, SEQ
ID NO: 5056. 


480 


100


1579 


AF248744 


Cryptosporid
ium parvum 


thrombospondin- related 
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA
Em:AK000219))


663 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198 


100 


1583 


AE001803 


Thermotoga
maritima


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch- like 1 protein 


3973 


100 


1585 


AF169675 


Homo
sapiens 


leucine- rich repeat 
transmembrane protein FLRT1 


3 494 


99 


1586 


AF118274 


Homo sapiens 


DNb- 5 


2626 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


95 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


3589 


AF169803


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4 . 


181 


47 


1591 


225535 


Homo sapiens 


nuclear pore complex protein 
hnupl53 


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccha 
romyces 


hypothetical protein 


235 


54 
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TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


IDENTITY






pombe




1595 


W78320 


Homo sapiens


Fragment of human secreted 


i3ie 


9£ 






protein encoded by gene 81. 






1596 


Y9490C 


Homo sapiens


Human secreted protein clone


2236 


S8 








rb649_3 protein sequence SEQ 












ID NO: 18. 






1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1406 


95 


1598 


AB032254 


Homo


bromodomain adjacent to zinc 


9676 


se 






sapiens 


finger domain 2k 






1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf 5C 


2305 


100


1601 


Y00876 


Homo


Human LAPH-1 protein 


1149 


96 






sapiens 


sequence . 






1602 


AJ22335: 


Homo sapiens 


H3RA- interacting protein 3 


2821 


99 


1603 


AJ22280: 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


ID U*» 






r\£> i ^ t" a 1 erihi tinoitiup 1 "i nnpr 


1601 


c c 


1605 


AF185576 


Mus musculus 


POZ/zinc finger transcription


3435 


97 








factor ODA-6 






1606 


AF093744 


Homo sapiens 


unknown 


133 


100 


1607 


A12142 


cvjti t" Yi p» h i r* 




800 


9f 






cons t ruct 








1608 


Y57949 


Homo sapiens


Human transmembrane protein 


1868 


100 








HTMPW-7^ 
ni t irii to* 






1609 


AF151044 


Homo sapiens




681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1 «* 728) 


376S 


100


1611 


Y0820O 


Homo sapiens 


rab geranylgeranyl


2976 


100 


1 






cicinsLc tost 






1612 




__ . 

Homo sapiens 




248€ 


99 


lb Xj 


n.\.\JOHH oi 


MidlJ 1 UOp I> 1 5 


nouiii.in-iiKc pxroceiii 


371 


26 






^ hal i ana 

U l AO JL A CI lid 








1614 


v a ocm 

i u y i>u j 


Homo sapiens 


NADH-cytocnrome-cB reauctaee 




100






rlOiTiO a ctp 1 Cflo 




3150 


9-, 


1616 


AJ010750 


Rattus


Castration induced prostatic 


890 


6i 






norvegicus


apoptosifa iej.dt.cti proiciii i; 












\^.irAn JW 






1617 


X58079 


Homo sapiens 


S100 alpha protein


4 81 


100 


1618 


Y6667B 




Membrane - bound protein 


967 


100 






sap mus 


PRO! 009 






"'619 




li\JA IU O JLV _L V- 1 la 




529 


100














1620 


AFl 5073 1 


LT /-n m |-\ earvi OT1Q 




286 


100 


1621 




llrMfin eani one 




4646 


96 


1622 


Ab4 1 / / 


Homo sapiens 


met a Hot hicnem 


3 80 


100 


j o z; J 








240 


36 






s fulgidus 


region AF0859 






T CO/1 






HIX L JlvXJJUx J. a J LaiJL JCJ pi ULCil* 


4 03 


34 


















pombe 








1625 


Y66746 


Homo


Membrane -bound protein 


1184 


100 






sapiens 


PR01198 . 






1626 


D90053 


Sus scrofa


destrin 


863 


100 


1627 


Y35954


Homo sapiens


Extended human secreted 


756 


100 








protein sequence, SEQ ID NO. 












203 . 






1628


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus


unknown 


286 


68 


1630 


AF017096 


Drosophila 


similar to C. elegans


493 


61 






melanogaster


R10H10.6 and S. cerevisiae 












YD8419 .03c 






1631 


X03077 


Homo sapiens 


lactate dehydrogenase -A 


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis


Contains weak similarity to 


143 


38 






thaliana 


GATA-6 DNA- binding protein 












gb|H36l35, gb|Z26200 come 












from this gene. 
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TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


1635 


AF026246


Homo sapiens 


HERV-E integrase


413 


90 


1636 


Y5054> 


Homo sapiens 


Human adult brain cDNA clone
ve8_1 derived protein.


112*. 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase


206t 


99 


1638 


AJ236247 


Mus musculus


putative phosphatase subunit


194t 


96 


1639 


Y9454: 


Homo sapiens 


Human secreted protein clone
yKzoi. j. protein sequence SEQ
ID NO: 9C.


1320 


100 


1640 


AF235030 


Homo sapiens


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila
melanogaster 


WDS 


358 


26 


1642 


Ml 93 5 3 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y704 5* 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2) . 


135* 


100 


1644 




Mus musculus 


WD repeat -containing F-box 
protein FBW5 


2671 


88 


164S 


W67816 


Homo sapiens


Human secreted protein 
encoded by gene 10 clone 
HCEMU42 . 


115l 


100 


1646 


X6715!:. 


Homo sapiens 


mitotic kinase-like protein-1 


44S< 


99 




rib .5 J.Ol 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1646 


Y8734; 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


I56fc 


93 


164S 


R95332 


Homo sapiens 


Tumor necrosis factor
receptor 1 death domain 
ligand (clone 3TW) . 


413'/ 


100 


lbaU 


* prim i o £ 


Homo sapiens


Putative map kinase 
interacting kinase 




QQ 


1651 




Homo sapiens 


EpslSR 


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana


putative protein 


134: 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 




Homo sapiens


dJ184J9.1 (KIAA0601 protein)


352 6 


100 


1655 


AL03142e 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


352t 


100 


i O DO 


nun noin 


Dictyosteliu 
m discoideum 


mycM 


297 




1657 


Y28919 


Homo
sapiens 


Human regulatory protein 
HRGP-5 . 


2251 


99 


1658


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 




I T*7 £ SAC 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 




Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin 
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


181j 


100 


1664 


AF214736 


Homo sapiens


2 


2774 


100 


1665 


Z48613 


Saccharomyce 
s cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668 


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate. 
Peptide, 370 


p40 


3 97 


43 
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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH-
WATERMAN
SCORE 


IDENTITY 






35 i 






1669 


Z 99753 


Schizosaccha
romyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein 


56 S- 


47 


1670 


G0313C 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


K96625 


Gallus 
gallus


cardiac muscle tensin 


118i 


54 


1672 


AF174482


Homo sapiens 


polycomb 3 


200i 


99 


1673 


Y51B46 


Homo sapiens 


Human 16.1 homolog protein
fragment . 


233 


29 


1674 


AF255334


Homo sapiens 


EXP3E 


15^ 


25 


1675 


YS4367 


Homo 
sapiens 


Human protein clone HP10563.


10S 


30 


1676 


Y25712 




Human secreted protein
encoded from gene 2.


3043 


99 


1677 


Y25712 




Human secreted protein 
encoded from gene 2.


1581 


91 


1678 


AF163151


Homo sapiens 


dentin sialophosphoprotein 


170 


17 


1679 


AF163151 


Homo sapiens


precursor 


170


17 


1680 


AK024453 


Homo sapiens 


rjLnJuuufis protein 


134& 


100 


1681


AF019236 


Dictyosteliu
m discoideum


TipD 


613 


34 


1682 


A0243459 


Leishmania
major 


proteophosphoglycan


153 


26 


1683 


Z6 9365 


Schizosaccha
romyces 


putative GTP-binding protein 


560 


46 


1684 


XS4910 




ERp2 6 


1334 


100 


1685 


AP286475 


rubripes 


regulator-like protein 


196 


19 


1686 


PS J 9129B 




vAnmlar ^nrfi no rn*r;t~ p i it T5» 


408'/ 


100 


1687 


Avl-275986 


Homo sapiens


transcription factor


2956 


100 


1686 


AC275986 


Homo sapiens 


transcription factor 


1886 


88 


1685 


X07311 


melanogaster


heat shock protein


138 


43 


1690 


AF240463 


Rattus 
norvegicus


LISl-interacting protein 
NUDE


1383 


83 


1691 


AC272078 


Homo sapiens 


APOBEC-1 stimulating protein


1256 


68 


1692 


A027207S 


Homo sapiens 


APOBEC-1 stimulating protein


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


katanin p6C 


1664 


66 


1694 


AF263 53S 


Homo sapiens 


arginine N-methyltransferase


1774 


100 


1695 


AF2226B9 


Homo 
sapiens 


protein arginine N-
methyltransferase 1-variant 2


1182 


81 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


2161 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein


48B 


54 


1700 


Y44676 


Homo sapiens


Human ARF-Related Protein-1 
(HARP-1) . 


938 


97 


1701 


AKC22407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AEC24574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AFC55078 


Homo sapiens


zinc finger protein 42 


421 


52 


1704 


AF3S8092 


Mus musculus


RP42 


1057 


77 


i-jos 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AE036345 


Drosophila 
melanogaster 


aquaporin


164 


24 


1707 


Y5S927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27I21 


Danio rerio 


G12 


212 


47 


1709 


AL3 91710 


Arabidopsis 


putative protein 


505 


50 
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TABLE 2 



PCT/US00/34263



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY






thaliana


1710 


B0131j 


Homo sapiens 


Human PRC241 polypeptide. 


1645: 


97 


171- 


U4 075C* 


Mus musculus 


formin binding protein 3C 


4563 


85 


1712 


AJ011116 


Mus musculus 


skeletal muscle and cardiac
protein 


149C 


89 


1713 


AF255303


Homo
sapiens 


membrane-associated nucleic
acid binding protein 


44U 


9S 


1714 


AF255303 


Homo
sapiens 


membrane -associated nucleic 
acid binding protein 


296C 


100 


1715 


UC8227 


Rattus 
norvegicus 


Ras- related protein 


511 


! 


1716 


AF168795 


Rattus 
norvegicus 


schlaf en-4 


112S 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1715 


ABC29333 


Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


AF071317 


Mus musculus 


COPS complex subunit 7h 


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


99


1722 


GDI 98 2 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6063.


718 


100 


1723 


AL032643 


is elegans 


Di.IIIJ.lQA Cv V/lI^liui.aLLCl ^ fcCU 

protein family UPF0034, 


825 


41 



1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ
ID NO:


586 


92


1725 


Y94441 


Homo 

n i f* n ^ 


111 tfn£* ¥*l Zk/^ "| nAOO> Cr\p f i r 
n U iIRi 1 1 /^UipUoC jJJLLli 1L 

r i utc^ii -L ■ 


1231 


100


1726 


AF25S44 3 


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF103426 


Homo sapiens




1810 


99 


1728 


D10884 


DIJS> LdUIUb 


lie UI uLOJ Llli 


1002 


99 


1729 


Z18529 


Gallus 


tensin 


1411 


84 


1730 


Z73423 


is elegans 


cDNA EST EMBL:Z14 908 comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF0SO891 


Homo sapiens 


PR00105 


470 


30 


1733 


AJ277724 


Homo sapiens 


hi s tone deacetylase 8 


2015 


100 


1734 


G04050 


Homo sapiens


Human secreted protein, SEQ
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus


leucine-rich-repeat protein


3531 


94 


1736 


AF096709 


Drosophila 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit


2417


S3 


1738 


L15314 


Caenorhabdit 


contains similarity to Pfam
family PF01772 N=1


206


37 


1739 


X54618 


Listeria
monocytogene
s


phosphatidylinositol specific
phospholipase C 


134 


27 


1740 


AL031658 


Homo sapiens


dJ3iOOl3.4 (novel protein 
similar to predicted C. 
elegans and C. intestinalis
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173. 


1013 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08 . 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein
APD08 . 


1854 


61 


1745 


AF22109B 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSZA 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116.


1332 


99 


1747 


Y94 294 


Homo sapiens 


Human coenzyme A- utilising 


842 


100 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES


DESCRIPTION 


SMITH- 
WATERMAN
SCORE 


IDENTITY 








enzyme CoAEN-2. 






1748 


AK024436 


Homo sapiens 


FLJ00026 protein


1619 


100 


174S- 


AE000877 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein


231 


36 



1750 


AF101361 


Drosophila
melanogaster


Abnormal X segregation 


193 


33 


1751 


Y15D67 


Homo sapiens 


ZNF232 


889 


100


1752 


AF251038 


Homo sapiens 


GAP- like protein 


822 


100 


1753 


AC003093 


Homo sapiens 


OXYSTEROL- BINDING PROTEIN ; 
45% similarity to P22059 
(PID:gl29308) 


352 


57 


1754 


X69089 


Homo sapiens 


165kD protein 


5703 


99 


1755 


AL049755 


Homo sapiens


dJ622L5.3 (novel protein) 


1039 


100


1756 


AL0313S3 


Homo sapiens 


dJ733D15.1 (Zinc-finger
protein) 


2765 


100 


1757 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransfera
se


2020 


99 


1758 


AL022236 


Homo sapiens 


dJ1042JC10.4 (novel protein) 


776 


43 


175$ 


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760 


Y12065 


Homo sapiens 


hNop56 


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 


U U £ J ZJ * 


sapiens


OCllC pi.tJU.ULL Wi wU J UUldl 1 Ly 

to dynein beta subunit 


1542 


51 


1763 




Homo sapiens


formiminotransferase
cyclodeaminase 


877 


100 


1764 


U9154 1 




n mucin Lot (uiiuj. no c i ciib i. era str 
cyclodeaminase (ftcd) protein,
carboxy-terminal end


596 


100 


1765 


AB013365 


Bacillus
halodurans


YlqF 


350 


34 


1766 


Y3 8421 




— - -z : 

Human secreted protein 

p nrndprl bv rip tip Mn "5 f. 


14 5 


71 


1767 


AC009176 


Arabidopsis
thaliana 


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


216 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769 


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens


AMSH 


1214 


56 


1771 


U89435 


Mus musculus


unknown 


829 


86 


1772 


£70011 


Rattus sp . 


tricarboxylate carrier 




95 


1773 


AL03S086 


Homo sapiens 


dJ44A20.2 (novel protein) 


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308. 


1057 


99 


1775 


AF11033O 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269529 


Homo sapiens 


glycerol 3-phosphate permease


2787 


100 


1777 


Z81579 


Caenorhabdit
is elegans 


cDNA EST yk76£3.5 comes from 
this gene 


232 


31 


1778 


AY007239 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 


AL109608 


Schizosaccha 

romyces 

pombe


oxysterol- binding protein 
family 


644 


38 


1780 


AF254260 


Homo sapiens 


tuftelin 1 


1729 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142


49 


1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333 


100 


1784 


AK024475


Homo sapiens


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014. 


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta- 


247 


100 
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SEQ 


ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY










glucuronidase exon 11 homolog


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TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS*


2 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e-
12 157-181




PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.06be- 
13 358-361 



4 


BL00028


Zinc finger, C2H2 type, 
domain proteins.


BL00028 16.07 9.400e-
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


t 


BL00023


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023
24.31 4.545e-27 353-
390


€ 


BL00023


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


7 


BL00023


Type II fibronectin
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


6 


BL00023


Type II fibronectin
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390


c 


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 B.llSe- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PR00464G 
12.41 4.231e-ll 377- 
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023


Ank repeat proteins. 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
14 282-295 PD00066
13.92 9.400e-14 477-
490 PD00066 13.92
6.500e-13 505-518
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


It 


BL00845


CAP-Gly domain proteins.


BL00845 16.43 2.200e-
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329


21


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


23 












binding region proteins.




oc; 


DT ftft 1 1 t. 

tJL»yO lib 


Eukaryotic RNA 


oJjOOllbT 8.45 7,273e- 






polymerase II 


^Jf 1208-124/ HLOOllSO 






heptapeptide repeat 


16. Uo 2./7©e-2j. 9b.?- 






proteins . 


383 BLiUUllbY 11 . Bfc 








o.OUUe-1/ lb04-lbbC 
















lb /31-//4 BIjOOIIdH 








1H . 33 J7.352e-J6 463- 








A O C DTftftl1C7\ 1C /I / 

4:7b DliUUllbA ib, 4% 








/.414e-lb 43-8* 




-1 




oJjUUJlbK o . bO b.l2oe- 








14 283-1010 BL00115J 








16.71 S.289e-14 5SI- 








617 BL00115I 8.33 








4.336e-13 535-55C 








BL00115L 12.25 5.939e- 








13 662-694 BL00115G 








11.65 6.011e-13 435-








463 BL00115K 15.03 








3.417e-10 617-65S 








T5T ft ft 1 1 Crt T. C 1C C DDCa 

bLUullbU lo./o b.bUbe- 








10 863-913 BL00115P








11.54 7.538e-10 913- 








9b3 BliDOlabi 38,24 








7.968e-10 1010-1052








BL00115U 10.34 4.475e-








09 1242-1265 


26


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420A 20.42 4.109e-
11 81-110 BL00420A
20.42 8.820e-10 84-113




BL00050


Ribosomal protein L23
proteins.


BL00050A 23.71 5.250e-
27 94-127 BL00050P
14.01 D,l45be-12 133-
147


28


PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925B 3.73 3.089e-
10 41-54




29


PF00756


Putative esterase.


PF00756C 14.12 1.108e-
09 486-516


32


BL00557


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins.


BLuubb/jJ 17. /o b.Obbe-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557E
21.27 8.898e-22 130-
169


34 


PR00629


SHC PHOSPHOTYROSINE


PR00629E 9.90 5.886e-






INTERACTION DOMAIN 


35 299-328 PR00629F






SIGNATURE


10.95 8.364e-32 334-








































■» on a nnno.T) OflO-^ci 








rKUUQz^y JLZ.fib J. / j?c* 










35


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270E
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


36


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270E
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.26 9.241e- 
10 264-298 


38


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.22 9.241e-
10 264-296 


39 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380B 12.64 7.366e-
14 342-360 PR00380C
13.18 6.927e-13 375-
394 PR00380D 9.93
2.180e-12 429-451
PR00380A 14.18 5.154e-
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A 
13.96 2.452e-14 204- 
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A
13.96 2.452e-14 180- 
199 


46 


DM01551 


Jew OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.536e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 9.328e-
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-16 216-
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 990-1015 BL00972A
11.93 7.120e-16 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


"52 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F
12.23 7.828e-15 196-
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
FR009B8B 11.60 /.yxoz;- 
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509-
530 PR00762A 14.22 
9.333e-18 199-217 
PR00762F 15.12 3.100e-
16 563-563 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-202 


58 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e-
10 1080-1135


59 


PF00791


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-223 


66 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e- 
10 51-64


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 100-110


73 


BL00239 


Receptor tyrosine kinase 
class II proteins.


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471


0 PROKARYOTIC DNA
TOPOISOMERASE I.


DM00471A 11.73 9.357e-
13 53-66 DM00471B
8.45 4.857e-12 70-81


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410 


83 


BL00708 


Prolyl endopeptidase
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081B 10.38 6.318e-
13 134-146 PR00081A
10.53 2.500e-12 54-72 


98 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579 


10; 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e-
14 289-306 


104 


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 6.786e-
18 298-314 BL00479A
19.86 4.913e-16 155-
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e-
12 181-197 


10( 


BL01019


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


10V 


DM01970


0 Jew 2K632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 5.000e- 
16 403-416 


10b 


BL00191


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL001910 
11.37 6.447e-17 182- 
204 


105 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 4.936e- 
37 8-47 


lie 


BL01138


Scorpion short toxins 
proteins . 


BL01138A 10.96 S.2S7e- 
10 38-50 


112 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 8.560e- 
13 36-67 




PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 ?.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


128


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032K 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e- 
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE


PR00990B 12.32 9.534e-
15 47-67 PR00990A
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
24 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141 


BL00501


Signal peptidases I
serine proteins.


BL00501D 16.69 9.535e-
14 113-133 BL00501C
9.61 8.688e-10 89-101 


143 


BL01020


SAR1 family proteins.


BL01020C 15.35 7.722e-
20 79-130


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479 


151


BL00632


Ribosomal protein S4
proteins.


BL00632 23.79 5.271e-
20 106-149 


154 


BL00559


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
23 13-35 


157 


BL00406


Actins proteins.


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132


Zinc carboxypeptidases,
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C
21.35 3.466e-12 104- 
145 


165


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins. 


BL00362 24.67 9.700e- 
15 129-172 


169


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A 

lO A A T Af/| B 11 111 

16.44 1.9b4e-14 

251 BL00039B 19.19 

BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 3.721e-
12 14-36 


178 


BL01310


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45




180


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


PR00007B 14.16 7.425e-
20 160-180 PR00007A
19.33 4.938e-19 133-
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


183


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


184 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929


AT-HOOK-LIKE DOMAIN
SIGNATURE


PR00929C 5.26 3.328e-
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-226
BL00383D 11.92 7.000e-
13 295-308 BL00383B
7.61 1.692e-11 187-195
BL00383C 10.10 1.750e-
09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
6.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
15 83-105 PR00450C
12.22 6.286e-13 47-69


193


PF00564 


Octicosapeptide repeat 
proteins . 


PF00564B 24.74 6.164e-
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e-
15 204-224 PR00503B
9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e-
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e-
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261E
14.12 4.000e-18 143- 
165 PR00261A 11.02 
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261E
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e-
16 143-165 PR00261F
11.57 4.938e-13 143-
165 PR00261E 11.08
7.188e-13 65-87
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 l.b43e-lx 143-
165


209 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors. 


PF00791B 28.49 6.143e- 
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.731e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60
1.675e-15 201-223
PR00007D 9.64 7.231e-
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.900e-
29 568-614 BL00039A
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 364-388 BL00039B
19.19 4.064e-11 277-
303 


217 


BL00100 


Chloramphenicol
acetyltransferase
proteins.


BL00100D 17.22 8.484e- 
09 68-10G 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e- 
11 199-227 


222 


BL00678


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


226


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B
15.11 8.200e-19 45-66 


229


PR00301


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e- 
13 329-346 PR00301G 
13.78 4.300e-12 361- 
382 


230 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 8.773e-
20 35-70 BL00460B
9.73 7.429e-16 78-96
12 111-134 BL00460D
16.89 8.773e-11 140-
160


231 


PR00647


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B IKJ.XH V.oZZe- 
09 273-287 






Cyclins proteins . 


X3 LM V V <i J 6 D 6u • J 1 / - t «u -> c; 

27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C 
17.27 4.462e-12 47-7C
PPftfld 4 41") 10 7C 7 1O0p-
11 109-123




PR00019


LEUCINE-RICH REPEAT


PR00019B 11.36 7.300e-






SIGNATURE 


10 251-265 PR00019E 








11.36 5.320e-09 119- 








PB0O019B 11 3£ 








T nQ TOO "5 /! 7 

l.OOOe-Oo 2<£?-/lJ 


236




LEUCINE-RICH REPEAT


PRO0O1QR 11 36 7 300c- 
















11.36 5.320e-09 113-








J.Z / rKUvUiJA ii . jc 




— 




1.000e-08 223-237 


237 


PD00289


rKUJ.fc.llN oni UUMAilv 


rD0Q2o9 y . i> / o.44oe-iJl? | 


1 






67-81 


240 


cunnm 1 

rKU UU11 


i IrL 111 tAjr ~ kiX ftt 












241 


PR00011


TYPE III EGF-LIKE 


PR00011D 14.03 3.492e- 






c t r:\72iTTn? r 

O XKjvinl UKt 


l u bio- dj- 


244


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
regions.


BL00903 12.93 8.941e-
12 54-64




245


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 8.043e-
09 124-134


248


BL00246


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 166-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C
15.56 4.857e-22 150-
175


250


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e-
10 /OJ - Z f O


254


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245


255


PD01796


PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU.


PD01796 15.01 6.045e-
09 61-88


256


BL50002


Src homology 3 (SH3)
domain proteins profile.


BL50002B 15.18 2.800e-
10 421-435


259 


PR00094


ADENYLATE KINASE
SIGNATURE


PR00094C 12.94 2.200e-
18 87-104 PR00094D
12.52 2.731e-14 161-
177 PR00094A 20.31
5.500e-14 lJL-2r
PR00094B 22.01 4.115e-
13 39-54 PR00094E
12.25 7.333e-13 178-
193


259


BL00892


HIT family proteins.


BL00892A 18.17 5.500e-
13 60-91






Proteasome A~type 


r)±iuvjoct\ £ j - -it j . uuve 






subunits proteins. 


40 8-54 BL00386B 








31.38 J.o64e-3J eo-lUo 








BLOOJoolJ Zu. /l J.uuue- 








71 1*5^-184 HL0G388C 








IB. /U H.14/e-Xfo ±2b- 








148 


264


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
regions.


BL00903 12.93 5.821e-
09 91-103




267


BL00107


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.529e-
09 241-257


270


BL00226


Intermediate filaments
proteins.


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 6.043e-35 196-
244 BL00226C 13.23
7.000e-20
BL00226A 12.77 6.143e-
15 96-n:


271


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI. 


PD02952C 15.76 9.731e- 
"6 235-265 PD02952 n 
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 l.OOOe- 
40 106-160 PD02929E 
18.36 8.800e-17 179- 
199 


274 


BL01027 


Glycosyl hydrolases 


BL01027B 15.34 3.486e- 


275 


PR00424 


ADENOSINE RECEPTOR 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052


Ribosomal protein S7
proteins.


BL00052A 27.85 6.000e-
13 137-184 BL00052B
15.17 5.143e-12 208-
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins . 


BL00790N 13.25 5.659e- 
13 267-294 


280


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85 


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B


287 


PF00929 


Exonuclease.


PF00929D 16.17 7.356e-

no 1 ^ Q_ 


291 


BL00326 


Tropomyosins proteins.


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 fi.714e- 
12 203-216 


295 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-583
BL00028 16.07 4.086e-
09 517-534 BL00028
16.07 7.429e-09 489-
506


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e-
16 111-136 BL00215A
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953 


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A
19.68 5.000e-25 102-
129 PF00953B 6.17
1.000e-13 182-194


304 


PF00152


tRNA synthetases class
II.


PF00152D 21.30 8.364e-
28 422-461 PF00152C
28.03 9.250e-21 220-
257 PF00152B 15.67
2.658e-13 159-184
PF00152A 19.66 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.250e-
35 37-76 


306


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X
proteins.


BL00522C 11.90 7.577e-
24 315-339 BL00522F
14.90 2.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310


BL00326


Tropomyosins proteins.


BL00326D 8.76 5.235e-
10 856-897 


312 


BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290A 20.89 4.706e-
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A
13.96 9.217e-16 1-20


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
35 63-76 


317 


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e-
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e-
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235




321 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.965e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins . 


BL01016C 22.84 3.925e-
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e-
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
6.855e-09 38-5C 


339 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL - 
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins.


BL01160B 19.54 5.042e-
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.400e-
30 16-55 


343 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e-
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98
7.000e-09 255-267


352 


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168-
181 


354 


BL00380 


Rhodanese proteins . 


BL00380F 9.76 6.694e- 
11 542-553 


355 


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


1C£ 
J3t 


rKUUjO / 


bUPlAlVJ^ l/i 1 IW SDV.tr .Uft 

TYPE 1 SIGNATURE 


IrKUUDo /n O ,UD 3. /UUc 

09 17-37 


355 


PD00066 


PROTEIN ZINC- FINGER 
METAL- BINDI. 


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 9.604e-
13 54-109 PF00791B
28.49 1.095e-12 21-76
PF00791A ^17.85 1.4J2e- 
09 71-126 PF00791B
28.49 7.440e-09 184- 
239 


362 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.273e-
11 279-334 


363 




RECOVERIN FAMILY
SIGNATURE


i'KUUflSU^- A^i . 3.UOUO-
10 73-95 PR00450C
12.22 3.270e-09 109-
131


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins. 


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160


Kinesin light chain
repeat proteins. 


SJJV J J- O I/O JL .7 . .J *3 t> # UT 1c 

09 1038-1092 


367 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C

proteins. 


3L01032H 11.25 4.1£0e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP-
binding region proteins. 


EL0O107A 16.39 J.OOOe- ' 
23 276-307 BL00107B
13.31 1.692e-12 342- 
358 


381 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-
12 50-66


382 


PR00624 


HISTONE H5 SIGNATURE 


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR. 


PD02870B 18.83 6.000e- 
10 97-130 


388 


PD00066 


PROTEIN ZINC- FINGER 
METAL-BINDI.


PD00066 13.92 8.000e-
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e-
09 151-174 


390 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 5.200e-
15 222-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-
09 272-285 BL00215B
10.44 8.500e-09 165-
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 E.579e- 
11 141-155 


398


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e-
09 55-74 


399 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 24.71 8.071e-
18 331-369 PF00676D 
14.40 3.854e-15 406- 
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C- terminal 
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e-
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e-
09 105-140 


404


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.4S0e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246- 
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e-
11 27-45




PF00426


Outer Capsid protein VP4 
(Hemagglutinin).


PF00426S 15.67 S.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular- type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.25 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.9S5e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C
20.98 5.235e-09 170-
209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435-
490 PF00791B 28.49
8.679e-09 367-422


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e-
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


432 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins. 


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.625e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e-
12 169-185 


433 


PR00828 


FORMIN SIGNATURE


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e-
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 182-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR0056S 


DOPAMINE D3 RECEPTOR
SIGNATURE


PRC05G6G 13 .95 5 .551e- 
09 39-5; 


451 


PF00084 


Sushi domain proteins 
(SCR repeat) proteins.


PF00084B 9.45 3.813e-
10 47-59 


452 


BL00790


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.£21e-
09 6ie-649 


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D 
9.93 1.000e-21 281-303 
PR00380C 13.18 8.286e- 
17 230-249 PR00380B 
12.64 4.724e-16 194-
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE 


PR00253A 9.15 2.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272-
294 PR00253C 13.85
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00649 


GLYCOSYL HYDROLASE
FAMILY 58 SIGNATURE 


PR00845D 9.77 S.236e- 
09 910-937 


471 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments
proteins.


BL00226E 23.86 3.721e-
09 282-230 


473 


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e-
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins. 


BL00481E 13.07 6.909e-
09 173-199 


479


PR00319


BETA G-PROTEIN 
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e- 
09 395-408 


480 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e-
38 8-47 




PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 452-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-16 411-43:


482 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 955-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e-
09 937-952 PR00049D 
0.00 8.322e-05 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e- 
23 653-673 PR00007A
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-1? 698-720
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567 


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214


488


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e-
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e-
09 663-678


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e-
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498


BL00120 


Lipases, serine
proteins.


BL00120B 11.37 7.923e-
09 185-20C


500 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e- 
11 299-316 


501 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.?39e-
17 492-510 


508 


PR00120 


H+ TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.£00e-
19 705-722 


509 


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 2U.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e- 
09 346-370


511 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1 . 


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.7S0e-36 295-
333 PD01841J 14.94
6.023e-35 851-886
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924- 
554 PD01841C 13.78 

PD01841K 10.82 8.594e- 
91 10^4-1077 pnni flan 

23.00 2.667e-13 549- 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS 
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740


MAM domain proteins. 


BL00740A 13.87 7.188e-
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins . 


BL00242C 16.86 8.320e-
09 12-42


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e-
10 61-95 


526 


PF00789


Domain present in 
ubiquitin-regulatory
proteins. 


PF00789B 19.70 3.306e- 
12 322-343 PF00789C 
20.98 S.269e-09 367- 

392 


528 


BL01162 


Quinone oxidoreductase /

zeta-crystallin 

proteins. 


BL01162C 22.80 1.500e- 
16 120-164 






529


PR00910 


LUTEOVIRUS ORF6 PROTEIN


PR00910A 2.51 3.893e- 




SIGNATURE


09 60-7? 


532 


BL00215


Mitochondrial energy 


BL00215A 15.82 4.000e-






transfer proteins. 


17 11-36 BL00215A








15.82 8.660e-11 123-








148


533 


BL00215 


Mitochondrial energy 


BL00215A 15.82 4.000e-






transfer proteins. 


17 11-36 BL00215A 








15.82 8.660e-11 97-122


534 


BL00098


Thiolases acyl-enzyme


BL00098C 21.65 2.800e- 






intermediate proteins. 


38 181-227 BL00098B 








32.59 5.345e-38 86-141 








BL00098D 26.30 8.364e-








35 245-286 BL00098E 








22.12 1.000e-34 314- 








352 BL00098F 10.18








4.971e-22 365-386 








BL00098A 10.60 6.455e- 








11 38-50 


535 


PR00370 


FLAVIN-CONTAINING


PR00370E 11.96 7.429e-






MONOOXYGENASE (FMO) 


22 321-340 PR00370D 






SIGNATURE 


16.33 6.143e-21 185- 








204 PR00370F 17.75








6.559e-21 376-396 








PR00370B 10.91 9.591e- 








21 27-46 PR00370C 








12.72 3.500e-20 140- 








157 PR00370A 3.35 








6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type, 


BL00028 16.07 7.429e- 






domain proteins. 


16 285-302 BL00028








16.07 6.294e-14 341- 








358 BL00028 16.07








1.346e-11 369-386








BL00028 16.07 1.692e-








11 397-414 BL00028 








16.07 4.4S2e-11 453-








470 BL00028 16.07 








7.231e-11 425-442








BL00028 16.07 4.300e- 








10 313-330 


537 


BL00762


WHEP-TRS domain 


BL00762A 23.43 9.419e- 






proteins . 


15 844-881 


538 


BL00762 


WHEP-TRS domain 


BL00762A 23.43 9.419e- 






proteins . 


15 819-856 


539 


BL00762 


WHEP-TRS domain 


BL00762A 23.43 9.419e-






proteins . 


15 822-859 


540


PR00985 


LEUCYL-TRNA SYNTHETASE 


PR00985A 12.10 9.000e- 






SIGNATURE 


10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 


PD02102A 16.74 1.000e-






VACUOLAR ATP SYNTHASE 


40 3-47 PD02102B 






HYDROL.


28.28 4.375e-34 57-100 








PD02102D 21.69 1.923e- 








30 179-218 PD02102C 








26.34 6.929e-26 100- 








146 


542 


BL00028


Zinc finger, C2H2 type,


BL00028 16.07 1.000e-






domain proteins.


10 48-65 BL00028 








16.07 6.400e-10 193- 








210 BL00028 16.07 








1.000e-09 343-360 








BL00028 16.07 6.914e- 








09 78-95


545 


BL00250 


TGF-beta family 


BL00250A 21.24 8.000e- 






proteins . 


31 293-329 BL00250B 








27.37 5.266e-24 354- 








390 


547 


PR00319 


BETA G-PROTEIN


PR00319B 11.47 2.714e- 
(TRANSDUCIN) SIGNATURE


09 106-202 PR00319A
15.27 7.344e-09 210-
227


548


BL01204 


NF-kappa-B/Rel/dorsal 
domain proteins. 


BL01204A 17.74 1.000e-
40 8-S6 BL01204D 

T C A*) 1 OHrto A f\ 1 
Ib.'idi X,JUUQ-H.V i. / /- 

221 BL01204E 13.83 

BL01204C 13.93 8.714e- 
22 141-260 BL01204B 
15.41 4.333e-16 102-
116 


549


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


553 


PF00632 


HECT-domain (ubiquitin- 
transferase).


PF00632C 20.66 3.302e- 
18.45 3.700e-21 1515- 


554 


BL00290 


Immunoglobulins and 
major histocompatibility
complex proteins. 


BL00290B 13.17 1.600e- 

20.89 2.059e-14 130-
153 






DDfYt TMl'.OTPU DDMTPTM ~- 

rKUijilNC.- til Ln FKUi&J.f» j . 


09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANS FORMING 6 IK Purl . 


DM01211L 11.93 3.762e- 
09 7-35 


562 


PF00658


Poly-adenylate binding
protein, unique domain 
proteins. 


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e- 
10 472-488


566 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.977e- 
13 229-268


569 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


570 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2. 63be-.il 
252 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 

267 PR00193B 11.69 

PR00193A 15.41 2.588e- 

V- X XZj J. -J — > rlvUu^ J J*i 

19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
20 585-925 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116 


DNA polymerase family B 


BL00116A 12.81 5.737e- 
proteins.


13 864-877 BL00116I: 
11.82 1.529e-12 952- 
965 


578 


BL00195 


Glutaredoxin proteins.


BL00195B 15.31 7.156e-
09 121-141 


579 




SIGNATURE 


11 217-231 PR00019B
11.36 1.350e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 




GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE 


25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT- BINDING 
REPEAT SIGNATURE 


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI 2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.87Se-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.196e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding
proteins . 


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e-
23 2o^-25<> 


589 


BL00478


LIM domain proteins. 


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321-
336


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e- 
15 931-948 


591 


PF00855 


PWWP domain proteins . 


PF00855 13.75 8.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e- 
16 558-576 PR00205A 

14 . /J y.JUUe-13 3QsZ~ 

558 PR00205C 13.65 
5 ^04p-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U2 . 


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242 


Integrins alpha chain 
proteins. 


BL00242E 9.03 9.591e- 
27 98S-1014 BL00242C 

316 BL00242D 13.57 
4.150e-25 357-382 
BL00242B 8.13 7.353e- 
12 189-199 BL00242D 
13.57 3.455e-ll 421- 
446 BL00242A 13.80 
5.00Ge-11 61-73
BL00242D 13.57 4.9fifc>e- 
10 291-316 


601


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 5.610e- 
09 198-213 


602 


PR00278


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.56Se-
10 331-348 


603 


BL00479


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315


Dehydrins proteins. 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-35B 


608 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 265-282 


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e-
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 875-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615 


PD02699


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699A 8.31 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182


616 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C
13.18 2.976e-13 436-
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 5.143e-
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e-
21 561-582 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e-
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd


BL00641C 21.10 1.000e-
40 157-202 BL00641E 
subunit proteins. 


24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 12.23
9.308e-29 216-240


627 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e-
18 367-380 PR00103B
13.39 2.080e-14 297-
312 PR00103A 9.59
2.957e-14 282-297
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 12.39
1.450e-11 175-190
PR00103A 9.59 3.720e- 
10 160-175 


630


PR00061


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00061A 10.53 6.211e- 
16 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 e.SOOe- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN.


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B
10.69 4.822e-10 1276-
1296 DM01206B 10.69
7.658e-10 1328-1348
DM01206B 10.69 8.274e-
10 1280-1300 DM01206B
10.69 4.532e-09 1320-
1340 DM01206B 10.69
7.266e-09 1326-1346


635 


BL00107


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 7.600e- 
23 145-176 BL00107B
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.545e-
30 101-143 BL00657B 
22.27 7.750e-26 349- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 2.000e- 
10 607-623


643 


BL00018


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.250e-
13 385-400 PF00628
15.84 3.455e-12 464-
479


648 


BL01129 


Hypothetical
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-222 


649 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 3.S08e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 6.664e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e-
12 1026-1045


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301
PR00253D 16.68 7.652e- 
20 422-441


654


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 126- 
156 PD01719A 12.89
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA-
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e-
09 563-57? 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 580-59H 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e- 
13 539-572 DM00215
19.43 4.750e-12 549- 

582 DM00215 19.43 
9.824e-ll 551-584 
DM00215 19.43 2.929e- 
10 548-581 DM00215
19.43 4.054e-10 550-

583 DM00215 19.43 
5.339e-10 552-585 
DM00215 19.43 7.107e-
10 544-577 


660


PR00688


XYLOSE ISOMERASE 


PR006881 13.78 9.518e- 
09 224-236 


661


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 S.950e- 
23 249-292 




PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360




PR00360B 13.61 7.158e-
10 596-610 


666 


PR00819


CBXX/CFQX SUPERFAMILY


PR00819B 10.83 8.900e-
10 704-72C 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e-
16 135-17*- 


668 


PR00019 


LEUCINE-RICH REPEAT


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10
681-694 BL00018 7.41
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR. 


PD00131B 34.97 1.000e-
34 356-41C PD00131C
19.59 1.346e-26 504-
542 


673 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667G 15.33 7.557e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608 


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.600e- 
10 614-629 PR00320C 
13.01 6.400e-10 572- 
587 PR00320B 12.19 
3.250e-09 572-587 


676 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.667e-
09 249-263


679 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 3.700e- 
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 


680 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 266-296 


681 


BL00019 


Actinin-type actin-
binding domain proteins.


BL00019D 15.33 4.200e-
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55 
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 8.07le- 
31 152-195 


692


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 45-57 


694


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e- 
09 58-70 


696


BL00680


Methionine 

aminopeptidase subfamily 
1 proteins. 


BL00680 14.37 5.304e-
17 173-195 


697


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e- 
11 242-265 


698


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e- 
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e- 
15 86-98 BL00523C 
12.64 5.500e-13 137- 
148 BL00523D 9.89 
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F 
10.85 6.351e-09 413- 
424 


703 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e-
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354Y 10.69 4.977e-
38 42£-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 7.54Se-
27 450-496 BL00039A
18.44 2.S37e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777


Sialyl transferase 
family. 


PF00777C 18.60 4.035e-
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e- 
12 131-242 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C 
16.42 1.000e-40 172- 
208 BL00243D 24.07
1.000e-40 222-274 
BL00243F 22.63 1.000e-
40 314-358 BL00243I 
31.77 6.571e-39 607- 
650 BL00243E 16.70 
3.077e-35 274-304
BL00243G 21.38 3.625e- 
34 358-400 BL00243H 
17.53 5.235e-29 567- 
593 BL00243A 17.61 
3.250e-21 63-84 
BL00243H 17.53 7.167e- 
16 477-503 BL00243H 
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
no a t n c c ~x 


720 


PR00217 


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11. OS 5.90Se- 
34 135-161 PR00704F 
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
23 75-98 PR00704A 

PR00704C 11.88 1.87le- 
18 99-llt 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A 
16.74 1.310e-ll 277- 

{.DC. xrrCi/v J <£(sl~ J.J.UX 

4.522e-ll 323-338 
PR00320A 16.74 6.586e-

±1 JZJ-jJO JtrKUUjZUlJ 

12.19 4.343e-10 323-

J Jo JrKUU^^UJD . 19 

6.914e-10 277-292 


731 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 286-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.082e- 
10 787-798


738


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.10Se-20 338-
384 BL00039C 15.63
9.100e-13 160-184
BL00039B 19.19 9.617e-
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e-
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 l.OOOe- 
40 256-305 BL00965B 
17.77 1.600e-25 226- 
lb3 BJjOu53bbA lU.rw 
6.400e-19 94-113


747 


BL00021 


Kringle domain proteins. 


tit n n Pin n n /, /I Cf in 

25 231-273 BL00021B 
13.33 5.345e-21 60-78


748 


BL00612 


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 22.22 6.880e- 
10 135-157 


752 


BL00795 


Involucrin proteins . 


BL00795C 17.06 6.000e-

17.06 9.444e-ll 370- 
415 


754 


BL00051


Ribosomal protein L39e


BL00051 20.92 1.935e- 
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III . 


DM01970B 6.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 35.35 9.020e-
12 99-150 


762 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 23.89 9.137e- 
10 206-240 


764 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83 
4.162e-09 85-100


770 


BL00031


Nuclear hormones 
receptors DNA-binding
region proteins.


BL00031A 19.55 9.571e-
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 1.450e-
16 4-26 PR00449E
13.50 3.520e-14 142-
165 PR00449C 17.27
3.032e-13 44-67
PR00449D 10.79 8.579e-
13 107-121 PR00449B
14.34 3.4£5e-11 27-44


773 


BL00523


Sulfatases proteins. 


BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 568-585 


776 


BL00028


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 7.686e- 
09 621-638 


777 


BL00028 


Zinc finger, C2H2 type,
domain proteins . 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e-
11 322-341 BL00030A 
14.39 7.000e-10 220- 
239 


779 


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.92Se- 
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A
16.12 6.769e-13 169- 
183 


781


BL00215 


Mitochondrial energy 
transfer proteins . 


BL00215A 15.82 9.250e-
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00299 


PROTEIN SH3 DOMAIN
REPEAT PRESYNA.


PD00289 9.97 6.276e-09
159-173


785 


BL00690 


DEAH-box subfamily ATP- 
dependent helicases 
proteins . 


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.767e-
10 1-21 


790 


BL00915 


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B 
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208 



GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e-
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-

16.54 1.918e-09 194-
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021


Kringle domain proteins.


BL00021B 13.33 6.339e-
13 40-58 


799 


BL01052 


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-

*1Q al-XJLt JrJJL>UJ.UDzrt. 

16.12 1.529e-32 3-35 

25 52-78 BL01052D 
10.26 5.737e-25 174-
194 


800 


BL00348


p53 tumor antigen 
proteins. 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309


Vertebrate galactoside- 
binding lectin proteins.


BL00309C 18.65 1.621e- 
09 62-87 


802




OLFACTORY RECEPTOR
SIGNATURE 


PR00245D 10.47 5.224e-
09 187-199 


804 


PF00774 


sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e-
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 S.875e- 
09 12-20 


810 


PD02346 


PHOTOSYSTEM II PROTEIN 
PRECURSOR 


PD02346F 12.89 4.340e-
09 317-354 
PHOTOSYNTHESIS.




811 


BL00685


CBF-A/NF-YB subunit 
proteins. 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357


Histone H2B proteins. 


BL00357 7.74 1.908e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins . 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e-
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855 


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142 


830 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e-
13 25-45 


831 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019E 
11.36 1.720e-09 136- 
150 PR00019B 11.3b 
3.680e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834 


PD00306


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 3.856e- 
09 78-111 


839


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-2J 4S1- 
510 PR00700C 13.17 
4.750e-14 449-467
PR00700F 11.18 8.500e- 
11 538-545- PR00700E 
17.57 3.100e-10 522- 

CI V 


64 ■ 


PR00109 


TYROSINE KINASE 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-15~ 


0 *iy 




L22 RNA- BINDING HEP . 


40 S8-112 PD02785A 


845 


BL00826


MARCKS family proteins.


BL00826C 7.63 6.738e-

U7 U ,5 « J l* 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.429e- 

J.U ID 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280


852


BL00420


Speract receptor repeat
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e- 
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 863-918 
BL00420C 11.90 1.900e-
13 355-366 BL00420C 

852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90 
7.955e-10 567-578 


857 


PR00388


3',5'-CYCLIC NUCLEOTIDE

class i: 

PHOSPHODIESTERASE
SIGNATURE 


PR00388A 10.45 2.778e- 
09 64-83 


B5? 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.92Se- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


se- 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107- 
123 PR00988F 12.23 
7.828e-15 158-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72 


863


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6 .8S0e-15 267- 
286 PR00775F 12.76
6.769e-14 249-267


866 


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e- 
09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


868


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e-
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes
attachment site 
proteins. 


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL . 


PD02102A 16.74 4.176e-
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins . 


BL01189A 14.27 1.000e-

Aft "J C 71 OT>fM 1 QOQ 

40 35-71 oliOlXo^D 
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


rKUUJ . 5U 1,1 0 

15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897 


PR00327


ICE NUCLEATION PROTEIN


PR00327C 6.37 5.247e- 
SIGNATURE 


09 313-328 


898


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 8.200e- 
16 254-267 PD00066 
13.92 8.200e-16 282- 
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 9.160e-
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381E 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291- 
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-
13 286-308 PR00381E 
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e- 
11 293-314 PR00381E 
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 6.557e-
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 9.308e-11
144-155


910 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.800e-
30 48-87 


912 


BL01104 


Ribosomal protein L13e
proteins.


BL01104C 15.14 6.000e-

r\f\ 1 C ft 1 c *5 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09
500-511 


923 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-54 






Actinin-type actin- 
binding domain proteins. 


DliUl/UXilt, is .OO / .1 JjC" 

25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e-
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat


BL00678 9.67 9.308e-11
proteins proteins. 


273-284 BL00678 9.67 
l.b00e-iC 314-32r 
BL00678 9.67 7.600e-10
360-371 BL00678 9.67 
6.579e-09 206-217 


929 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 1.857e-
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins. 


BL01085D 16.55 4.600e- 
24 134-165 BL01085B
10.15 5.680e-22 30-52
BL01085E 18.87 8.676e- 
20 172-202 BL01085C 
21.81 2.038e-14 66-97


931


BL01085


Ribulose-phosphate 3-
epimerase family 
proteins. 


BL01085D 16.55 4.600e-
24 152-183 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 

rt rt i a a rt rt T rt -» rt o cr /**» 

20 190-220 BL01085C 
21.81 2.038e-14 66-97 


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301A 10.24 6.400e- 
09 160-171 


936 


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e-
12 336-362 


937 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e-
10 5-49 


940


PR00862 


PROLYL OLIGOPEPTIDASE 
SERINE PROTEASE (S9A) 
SIGNATURE 


PR00862D 16.17 4.086e-
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins. 


BL01230B 11.62 2.373e- 
09 407-420 


948


BL00479 


Phorbol esters / 
diacylglycerol binding
domain proteins.


BL00479B 12.57 7.429e- 
18 52-68 BL00479A
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e-
12 47-60 


957 


BL00379 


CDP-alcohol 

phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148 


959 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 1.884e-
10 31-75 


960 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962 


BL00061 


Short-chain

dehydrogenases/reductase
s family proteins.


BL00061B 25.79 6.586e- 

m — « rvn rt rt r* 

U 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


FR00308A 5.90 7.03Se- 
09 55-70 


967 


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
126 DM01206B 10.69

D.C/i.C U3 JO OO 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e-
31 417-460 PF01008C
15.25 5.333e-18 506-
526 PF01008A 20.14 
5.875e-15 369-390 


970


BL01277


Ribonuclease PH
proteins.


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-76


975 


BL01159


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 3.60Se-
12 130-145 BL01159
13.85 4.122e-10 171-
186 


977 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791C 20.96 2.235e-
09 55-94 


978 


BL01167 


Ribosomal protein L17
proteins. 


BL01167B 20.66 8.258e-
19 88-127


979 


BL00478


LIM domain proteins. 


BL00478B 14.79 9.357e-
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e-
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 

33 92-122 PR00312B 

PR00312G 11.11 6.657e- 

11.70 6.914e-27 35-59


981 


PF00992 


Troponin . 


PF00992A 16.67 8.816e-
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE 


09 127-149 


983 


BL01150


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795


Involucrin proteins.


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e-
10 14-59 BL00795C 
i / . vb /.cuiB-iu 4. i 
BL00795C 17.06 8.640e-
10 19-64 BL00795C
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins. 


BL00939F 17.27 5.393e- 
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-
11 497-513 


994 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.2S0e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e- 
20 174-193 PR00926B
16.07 2.12Se-18 24-39
PR00926A 10.41 1.000e-
15 11-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 86-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007


PR00304


TAILLESS COMPLEX 
POLYPEPTIDE 1 
(CHAPERONE) SIGNATURE 


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION . 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins. 


BL00175A 15.42 5.179e- 

12 6-26 BL00175C 

23.75 8.062e-10 79-111


1025 


PR00305


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C
14.83 8.844e-11 288-
335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e-
11 55-82 


1039 


BL00299


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE 


PR00970A 17.73 6.143e- 
20 56-78 PR00970D
SIGNATURE 


9.S6 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 243-258
PR00970B 16.37 1.290e-
13 66-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 6.786e-
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain
proteins.


BL00615A 16.68 1.720e-
11 228-236 BL00615E 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 




xrtn\jxi \j\3Xj\jd\ju J n v a x ui* . 


* kJl'l V J XO JL 3 . T X 1 . u X. v c 

12 102-136 


1050 


BL01073 


Ribosomal protein L24e 


BL01073 24.20 1.000e-
40 12-62


1054 


BL00571 


Amidases proteins. 


BL00571 25.69 5.875e-
31 160-212 


1055


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e- 

11 98-117 BL00030B

/ . U J ^ . J J. otr - U J* / -n / 


1058 


BL00223 


Annexins repeat proteins 
domain proteins . 


BL0C223C 24.79 8.7S4e- 
*5i o^9-ii"7 m.nn9'?i2i 
15.59 9.478e-14 46-GO 
BT.00771R 1^ ^9 5 ^^"Jg- 
11 118-152 


1060 


BL00027 


'Homeobox' domain


BL00027 26.43 3.455e-


1064 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 6.211e-
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.000e-
09 115-129 PR00019E 
11.36 3.880e-09 87-101 


1066 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 4.600e- 

XO X Z> X - X 1 4 rKVUJ^OL 

9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09
proteins proteins. 


298-305- 


1081


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 7.39Ee- 
10 23-57 




BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e- 

9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


109b 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB
FIMBRIA TRAN. 


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.2S3e-21 118- 
151 PD02811C 13.25 


109* 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB

CTT'CtJT7> "DAM 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B

144 PD02811C 13.25

3 . D^DC 1J It f lull 


1097


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e- 
13 111-147


110S 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e-
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


111£ 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e-
20 42-60 PR00405A
17.71 2.703e-17 23-43
PR0U4Ubt- 1:7.41 b . su^e- 
10 53-8b 


lllf 


BL00355


HMG14 and HMG17
proteins.


20-51 


1117 


BL00355 


HMG14 and HMG17
proteins.


BJ->UU3bb b.y/ z .biJbe-ZD 

20-51 


1 1 2 C: 


BL00107


Protein kinases ATP- 
binding region proteins.


£}JjUJXU/JD J. J . Ol S.Oa/c" 

10 290-306 


1122 


PR00412


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


112S 


PR00186 


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.800e-

nn o *-J -ion 

09 87-103 


1125 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 
33 84-129 BL00170B 

1 o.oJoC ^SO All* 

BL00170A 17.08 3.455e- 


1131 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 5.304e- 
ic 99^46 BL.00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain
proteins. 


BL00990C 18.78 4.176e- 

21.44 4.316e-36 94-132
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e- 
34 100-128 PR00314D 
9.66 3.S31e-33 233-261 
PR00314C 16.05 8.909e- 
32 159-188 PR00314A 
14.53 1.281e-22 13-34


1139


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 6.364e-
13 13-57 


1143 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 4.00Ge- 
19 451-482 BL00107B 
13.31 3.077e-12 519-

535 


1146 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652E 
8.50 9.463e-10 740-792 


1157 


PD02894


HYDROLASE N4- PRECURSOR
PROTEIN SIGNAL BE.



PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases
proteins.


BL00623E 15. CO 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 

337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e-
09 302-350 


1177 


BL01032 


Protein phosphatase 2C
proteins.


BL01032G 8.33 1.422e- 
10 34-48 


1178


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e- 
10 205-220 PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19 
8.457e-10 35-50 
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215


Mitochondrial energy
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e- 
10 77-93 


1188 


BL00878 


Orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e-
16 189-204 BL00878C
17.74 8.435e-15 225-
245 BL00878F 19.67
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


XXV X 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


12 203-220 PD02939C 
20.01 1.000e-11 224-
252


1193 


PR00345


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e-
28 72-101 PR00345B
8.54 7.652e-28 149-174
PR00345C 4.54 9.100e- 
28 101-125 PR00345E 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62


1194


PR00345 


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e-
28 137-161 PR00345D
10.97 1.964e-24 161-
185 PR00345A 13.46 
5.645e-16 79-98 


1195


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01296 


Dihydrodipicolinate
reductase proteins. 


BL01296A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain

dehydrogenases/reductase 
s family proteins. 


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


1206


BL01183 


ubiE/COQ5

methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208


BL00979 


G-protein coupled
receptors family 3 
proteins. 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e- 
11 49-65 PF00023B 
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 7.750e-
14 227-241 PR00048A
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP- 43) 
proteins. 


BL00412D 16.54 5.598e- 
10 179-230 


1219


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 5.348e- 
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066 
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit 
profile.


BL50058 27.23 1.000e-
40 13-61 


1226


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 13.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86 
1.000e-40 190-239 
40 248-301 BL00437E 

71 cjS 1 000*»-4n 177- 
379 


i O ~\ r 




Kinesin light chain
repeat proteins. 


DXjUXXOUU 1?. U.t7/C" 

10 5-60 


16J X 


rKUU / J i? 


FAMILY 8 SIGNATURE 


1 ,l\v \J / j Ot\ J. X • X " C.Oj/c 

09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 


PR00497A 6.92 5.553e-

X v i30 N J. / 0 


1233 


PR00497


NEUTROPHIL CYTOSOL

ITTi PTHD n/t ft 0°TT , KI7ST , TTDir 

rAk,A*JN P4U bitsNAiUKr. 


PR00497A 6.92 5.553e- 
Tfl icq in c 

IV X30-X/0 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins. 


BL00866B 36.29 2.776e- 

ft Q 7C_ 101 

vy i j x c x 


123T 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.104e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L
9.47 4.490e-10 174-189 
PDOllooJj 9.47 /.blze- 
10 183-198 


1249 


BL00018


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10
183-196 


1254 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373


Phosphoribosylglycinamid
e formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011


TYPE III EGF-LIKE
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


12SS 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 8.286e-
10 31-40 


1261 


PR00070


DIHYDROFOLATE REDUCTASE
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-327 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 S.SOOe- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230-
267 BL00462C 27.41 


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 5.670e- 

ii n ci 
XX 1 /-faX 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e- 
18 165-182 PR00837A 
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 

1 7 701 -7 1 ^ 
XZ ZUX "4iD 


1269 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 9.308e- 

13.50 1.000e-16 137- 
xou rKuim xu. /j 
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins
proteins. 


BL00276A 8.87 l.SOOe- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e-
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e- 

SIGNATURE


11.30 1.857e-11 165-
179 PR00412A 13.23
3 400p-li 100-11° 


1277 


PF00756


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1275 




Serine proteases , 
trypsin family, 


13 128-145 


128C 


BL01220 


Phosphatidylethanolamine 
-binding protein family 


BL01220C 14.75 9.348e-
15 248-276 


1285 


BL00518


Zinc finger, C3HC4 type


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e- 


1 "3 Q O 




SIGNATURE 


Donnnnon ic ci i cine 

rKUUOUAD lo .OX JL.OJ.Ue- 

10 81-105 


1297 


PR00716


M-PHASE INDUCER


PR00716C 17.65 5.696e- 


1298 


BL00478


LIM domain proteins. 


BL00478B 14.79 6.478e- 


1301 


BL00127 


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e- 
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 
15.82 1.000e-16 226-
251 BL00215A 15.82
2.658e-13 107-132


1308 


PR00898


VASOPRESSIN V2 RECEPTOR 


PR00898H 11.34 4.682e- 

U9 3D^~ D 1 Z, 


1309 


PD00301


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194


Thioredoxin family 
proteins . 


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 8.96Se- 
10 53-97 


1316 


BL00134


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.32Se- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins . 


BL00783C 22.43 6.559e- 

14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 

J.j£ ft - DO 


1327 


PF00514


Armadillo/beta-catenin-


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B
7.03 4.789e-09 168-178


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332


PR00161


HYDROGENASE/B - TYPE 


09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.76Se- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE 


PR00700D 12.47 2.200e-

PHOSPHATASE SIGNATURE


09 211-230


1340 


PR00860


VERTEBRATE
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 5.034e-
13 5-18 


1341 


BL00893


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate
racemases proteins. 


BL00923B 11.41 5.935e- 
10 135-146 


1346 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e- 
13 44-57 


1350 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C

lZ.oO b.Jlce-31 1/5*- 

207 PR00193B 11.69 
3.571e-24 133-15S 
PR00193E 15.47 9.069e- 
22 470-499 PR00193A
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200-
224 PR00447A 12.72 
6.357e - ll y/-i^4 
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium
binding protein. 


BL00303A 21.77 6.667e-
26 45-82 BL00303B
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins . 


BL00039D 21.67 5.950e-
29 375-421 BL00039A
18.44 7.136e-29 99-138
BL00039C 15.63 4.000e-
18 225-249 BL00039E
19.19 3.182e-14 144-
167 


1357 


PF00615


Regulator of G protein 
signalling domain 
proteins . 


PF00615B 16.25 2.216e-

10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.234e-
29 10-49 


1361




PROTEIN HMG17 FAMILY 
SIGNATURE 


18 14-29 PR00925E 
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
<*fi 1 8^7^-10 76-R7 


1362 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e-
30 136-171 BL01272C
21.68 3.324e-25 249-
274 BL01272A 6.49
1.231e-18 99-117


1363 


BL01272


Glucokinase regulatory 


BL01272B 19.61 6.670e-
30 113-148 BL01272C
11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94


1364 


DM00179


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e- 
10 1-19 


1371 


BL00242 


Integrins alpha chain 
proteins . 


BL00242B 8.13 8.615e-
09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY
SIGNATURE


PR00625E 13.48 7.353e-
19 46-67 FR00625-H 
12.84 1.391e-16 14-34


1373


BL00434 


HSF-type DNA-binding
domain proteins. 


BL00434C 23.65 3.770e- 
09 90-130 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE


PR00962C 8.00 6.337e-
05 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 
ASSOCIATE . 


PD02475A 23.18 8.552e-
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.571e- 
32 24-63 


1380


BL00194


Thioredoxin family 
proteins . 


BL00194 12.16 8.333e-
12 48-61


1381 


DM01970


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 6.203e-
10 95-132 


1386 


BL01160


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 S.042e- 
09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.512e-
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e-
10 127-137 


1393 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-
25 88-110 PR00380D
9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-
16 208-226 PR00380C
13.18 6.538e-16 243-
262 


1394


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 346-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333


1398


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.786e-
32 10-49 


1400


DM01206


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-29C 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 2S.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA- binding 
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


14 06 




LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e- 
09 176-190 


140£ 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
ZjO PKOOblUr 9.88 

8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 




PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR . 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358


Ribosomal protein L5
proteins . 


BL00358B 22.76 1.000e-
40 57-103 BL00358C
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e- 
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins . 


BL00282 16.86 7.338e- 
10 511-534 


1411 


BL00023 


Type II fibronectin
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


141t 


DM00973


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1415 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


142C 


PD01941


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 6.318e-
11 1009-1028 


1424 


BL50002


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19
4.462e-11 208-227
BL50002B lb.JLo l.UUOe- 
09 244-258 


1425


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628


PHD-finger.


PF00628 15.84 3.045e-
12 377-392 


1427 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE


PR00405B 11.83 5.114e-
16 281-299 PR00405A
17.71 4.306e-14 262-

COZ 


1426 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins . 


BL00039D 21.67 5.219e- 
34 147-193 


14 2 5 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 8.920e- 

1 n R*7*7 - R 09 
If O / / - 376 


1430 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928 


GRAVES DISEASE CARRIER 


PR00928B 13.53 3.769e- 
PROTEIN SIGNATURE 


10 103-124 


1432 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e-
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1436 


BL00290


Immunoglobulins and
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A
20.89 4.000e-09 188- 
211 


1440


PR00806


VINCULIN SIGNATURE 


PR00806B 4.26 4.960e-
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 4.960e-
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 



PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194


1446 


PF00816


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 


1447 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


1446 


DM00315


072 RIBONUCLEASE
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.146e- 
09 382-405 


1455 


PF00777 


Sialyltransferase
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


146C 


BL00545


Aldose 1-epimerase 
proteins.


BL00545C 11.28 7.353e- 
17 169-182 BL00545A
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e-
09 267-277
1477


PF00566 


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e-
10 466-476 


1478


BL00030 


Eukaryotic RNA- binding 
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479


DM00406 


GLIADIN.


DM00406 7.73 8.541e-10 
292-305 


1480


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins . 


BL00290B 13.17 2.385e-
15 69-87 BL00290A 
20.85 5.091e-11 12-35


1481 


PR00150


PHOSPHOENOLPYRUVATE 
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.03Se- 
09 21-51 


1482


PF00780 


Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 l.lS3e- 
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.905e- 
25 17-56 


1486 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107B 13.31 1.529e- 
09 34-50 


I486 


BL00039


DEAD-box subfamily ATP-
dependent helicases 


BL00039D 21.67 9.586e- 
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase 
proteins . 


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 

5.357e-ll 93-115 


1491




Guanylate cyclases
proteins . 


31 63-106 DL00452E 
11.92 3 .045e-13 115- 
131 


1492




SIGNATURE 


PR00019A 11.19 3.667e-
09 532-546 


1497 


BL00107


binding region proteins. 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500


rr w v o / \i 


On Tf* f ami 1 v 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 

proteins.


BL01177E 20.64 5.600e- 
00. BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-172 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 6.755e- 
10 341-363 


1512 


BL00523


Sulfatases proteins. 


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302- 
331 BL00600G 12.4?
9.625e-17 377-396 
BLU0600B 11?. 6U b.Oble- 
15 160-186 BL00600C 

ic no c f\At- a ^-\r> i or. 
j.o.j.0 o.u^i»e*"J.A i?i<* 

206 BL00600F 8.77

1.000e-ll 343-356 

BL00600D 8.71 1.000e-

10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1520 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
9.743e-10 106-121 
PR00320A 16.74 1.878e- 
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74
8.683e-09 272-287 
PR00320C 13.01 b.BOOe- 
09 106-121


1538


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 4.508e-
15 171-184 


1539


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e-
10 103-127 


1540 


PR00965


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 S.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231
PR00965C 15.04 1.000e-
27 131-151 PR00965D 
5.84 1.000e-27 150-170 
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109 
PR00965A 12.52 1.000e-
25 35-55 PR00965I
3.91 6.442e-25 385-406 


1543 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 - 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 -J.857e~ 
10 182-197 PR00049D
0.00 7.102e-09 67-82 


154"/ 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D 
13.94 B.714e-40 142- 
177 BL00951A 15.10 
1.000e-38 2-38

a L>\J V z> Z) 1 ti i.1 .£J 

33 38-69 


1548


BL00536


Ubiquitin-activating
enzyme proteins.


30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.13Se- 
09 58-73 

1556 


BL00061


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1558


BL01228


Hypothetical cof family
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1559 


BL01228


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 fi.SOOe- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 553-
575 


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 8.594e- 
17 184-228 BL01013C 
9.97 4.906e-12 14-24


1567 


BL00678


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1570 


BL00479


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.66 6.62Se-15 273 - 
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C 
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e-
19 24-39 PR00665F 
5.60 2.929e-15 246-260
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 9.308e-
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins. 


BL00524A 9.65 6.776e-
14 52-73 


1580 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e- 
12 32-54 BL00411K 
15. do 4.441e-ll ^4b- 
276 


1582 


PR00604


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


T\Tl t\r\ C f\ A 11 "I I *5 /l ^ 

PR00604A 11. 1J X.^^ue- 

09 79-87 


-L Do *i 




BTB (also known as BR-
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM013S4S 11.61 7.750e- 
09 474-495 

1587


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.95be- 
33 180-210 PR00072A 
12.75 6.040e-25 120-
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45 
5.304e-lS 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191K 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III . 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 


DM00517B 10.96 6.62Se- 
16 1175-1193 DM00517A 
8.21 1.000e-ll 101b- 
1026 


1592


BL00037 


Myb DNA-binding domain
proteins repeat proteins 
proteins . 


BL00037B 15.92 3.250e-
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-ll 64-90 
BL00037C 16.86 9.654e- 
10 146-164 


159E 


BL00028 


Zinc finger, C2H2 type,
domain proteins . 


BL00028 16.07 1.514e-
09 110-127 


1596 


PF00628


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


160C 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 6.S7le- 
10 30-35 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.402e-
10 136-167 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e-
10 44-57 


1607 


BL00252


Interferon alpha, beta 
and delta family 
proteins . 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94 


1611 


BL00904 


Protein
prenyltransferases alpha
subunit repeat proteins 
proteins . 


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168


1612 


PF00168


C2 domain proteins. 


PF00168C 27.49 3.250e-
09 36S-3S1 


1613 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 6.051e- 
09 932-983 BL00412D
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 3.531e-
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63
6.870e-16 124-176 
BL00559L 13.60 9.000e-
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI . 


PD01427E 22.45 3.02be- 
22 500-541 PD01427A 
19.94 8.773e-18 439- 




WO 01/53312 



PCT/US00/34263



471


1616 


BL00115 


Eukaryotic RNA 
polymerase II
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e- 
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617


BL00303 


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e- 
09 137-247 


1615 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METKI . 


PD01888B 25.10 1.000e-
40 47-97 PD01888C
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.S00e-15 7-23 


162 2 




MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PRf>n??9F 3 Sfi 3 45 5e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.191e-09 703-715


1622. 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e-
18 27-41 PR00860C 
9 1 474e-14 41-51 
PR00860A 5.46 1.720e-
14 5-18 


1624 


PR00784


MITOCHONDRIAL BROWN FAT

UNCOUPLING PROTEIN 
SIGNATURE 


11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A
24.83 6.786e-23 61-93


1631 


BL00064 


L-lactate dehydrogenase
proteins . 


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
i . u JUc-^u z^j — £. i z> 
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
*>", 1C 1 000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063


RIBOSOMAL PROTEIN L27
SIGNATURE 


11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE


PR00239D 0.00 1.105e-
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial- type phytoene 
dehydrogenase proteins. 


BL00982A 18.41 5.388e-
11 11-43 




BL01183


ubiE/COQ5

methyltransferase family
proteins . 


BL01183B 21.31 8.144e-
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01
2.800e-10 279-294 
PR00320C 13.01 2.800e-
10 364-379 PR00320B 
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294 
PR00320A 16.74 2.098e-
09 229-244 


1642 


PF00023 


Ank. repeat proteins. 


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e-
11 74-94 


1644 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10
109-120 BL00678 9.67
5.737e-09 526-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.366e-
17 56-89 


1646 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e-
21 103-125 PR00380D
9.93 6.300e-18 386-408
PR00380C 13.18 7.923e-
16 332-351 PR00380B
12.64 6.657e-15 292- 
310 


1647 


DM01242 


3 THREONINE--TRNA
LIGASE. 


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e-
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651 


BL01160


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.720e-
11 431-485 


1652 


BL00933


FGGY family of 
carbohydrate kinases 
proteins . 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins.


BL00795C 17.06 2.988e-
10 70-115 


1654 


BL00982


Bacterial- type phytoene 
dehydrogenase proteins. 


BL00982A 18.41 7.750e-
17 302-334 


1655 


BL00982 


Bacterial- type phytoene 
dehydrogenase proteins . 


BL00982A 18.41 7.750e- 
17 282-314 


1656 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630 


1657 


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1656 


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406


Actins proteins.


BL00406D 12.58 8.767e-
15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA

METHYLTRANSFERASE 

SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e-
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 107-125 PR00319C
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B
11.47 8.200e-19 70-85


1664 


BL00018


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 S.050e-10 
489-501: 


1667 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


166^ 


BL01153 


NOL1/NOP2/sun family
proteins . 


BL01153D 19.69 1.188e- 
17 115-141 BL01153C 
13.67 8.577e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678


PI3 KINASE P85
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598


Chromo domain proteins. 


BL00598 14.45 8.500e-
20 27-49 


1673 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357


1676 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e-
19 427-448 PR00747G
14.50 2.286e-18 368-
393 PR00747C 12.06
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13
8.244e-15 254-272
PR00747B 7.65 5.355e- 
13 75-90 PR00747F 
13.56 8.714e-10 311- 
326 


1677 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


168a 


BL00678


Trp-Asp (WD) repeat
proteins proteins . 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


168S 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e- 
09 755-771 


1690 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


169y 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


16S2 


BL00674 


AAA-protein family
proteins . 


BL00674C 22.60 8.043e- 
24 274-317 BL00674B
4.46 4.000e-23 242-263
BL00674D 23.41 e.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


16S7 


PR00409


PHTHALATE DIOXYGENASE
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466E 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 
09 498-517 


1699 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.76Se-ll 255- 
272 BL00028 16.07 
S.lS4e-ll 171-198 
BL00028 16.07 5.500e- 
11 227-244 BL00028
16.07 1.600e-10 199- 
216 


1700 


BL01019 


ADP-ribosylation factors
family proteins.


BL01019A 13.20 3.348e-
15 62-102 BL01019E
19.49 4.000e-15 107-
162


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain
proteins. 


BL01159 13.85 6.523e- 
11 232-247 BL01159 
13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 7.000e- 
10 187-203 


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar). 


PF00642 11.59 9.550e- 
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241


1715 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353 


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 S.COBe- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e- 
12 79-100 BL00038A 
13.61 4.000e-ll 52-6E 


1723 


PD00567 


PROTEIN RNA- BINDING RNA 
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428 


1724 


BL01279 


Protein-L-
isoaspartate (D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281


1728 


BL00018


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 296-350 


1732 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-
10 316-370


1733 


PF00850


Histone deacetylase 
family. 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D
14.76 6.850e-20 177-
201 PF00850E 8.86 
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179


KINASE ALPHA ADHESION
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-167 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D
10.79 2.241e-10 109-
123 PR00449E 13.50 
9.289e-10 144-167 


1745 


BL00720 


Guanine- nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e- 
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150-
168 


1747 


BL00439


Acyl transferases 
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G
13.40 2.695e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC- FINGER 
METAL- EIND: - 


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117- 
130 


1753 


BL01013 


Oxysterol -binding 
protein family proteins. 


BL01013D 26.81 6.516e-
18 33-77


1754 


BL00790


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e-
09 287-318


1756 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e- 
35 10-49 


1758 


DM00406


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR 1 . 


PD02929A 28.27 4.529e- 
09 224-278 


176S 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.077e- 
14 523-539 


1776


BL00942


glpT family of 
transporters proteins. 


BL00942F 15.07 4 .343e- 
10 371-3e9 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 2.373e- 
09 279-312


SEQ ID NO:


ACCESSION
NO.


DESCRIPTION 


RESULTS* 


1778 


BL00084


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.31 3.700e-
20 169-224 BL00084E
24.26 8.134e-16 10-56
BL00084C 27.71 8.412e-
11 107-158


1779 


BL01013 


Oxysterol -binding 
protein family proteins.


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C $.31 
6.308e-13 435-445
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
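
For readers who wish to process the RESULTS column programmatically, the field order stated in the footnote above can be sketched as a small parser. This is only an illustrative sketch: the names `SignatureHit` and `parse_results` are hypothetical and do not appear in this specification, and the code assumes each hit follows the footnote order exactly, with values that wrap across lines after a trailing hyphen.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical helper type)."""
    subtype: str      # accession number subtype, e.g. "PR00747H"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# subtype, raw score, p-value in e-notation, then start-end position
_HIT = re.compile(
    r"(\S+)\s+(-?\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?e-\d+)\s+(\d+)-(\d+)"
)


def parse_results(cell: str) -> list[SignatureHit]:
    """Parse a RESULTS cell into hits, in the order given by the footnote."""
    # Rejoin values wrapped after a hyphen ("8.636e-\n19", "368-\n393").
    text = re.sub(r"-\s*\n\s*", "-", cell)
    # Collapse the remaining line breaks and runs of spaces.
    text = " ".join(text.split())
    return [
        SignatureHit(m[1], float(m[2]), float(m[3]), int(m[4]), int(m[5]))
        for m in _HIT.finditer(text)
    ]
```

Applied to the first two hits listed for SEQ ID NO: 1676, `parse_results` would return one `SignatureHit` for PR00747H and one for PR00747G, each with its score, p-value and residue range as separate fields.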
TABLE 4



SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


2 




Immunoglobulin domain 


2.1e-32


109.5 


3 


pkinase


Eukaryotic protein kinase
domain


1 . 3e-2S 


110.7 


4


zf-C2H2


Zinc finger, C2H2 type


1.6e-21


84.9


5 


fn3 


Fibronectin type III domain 


0 


1097.1 


6 


fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0 


1090.4 


8


fn3


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain


4e-40 


146.7 


10 


p450


Cytochrome P450


9.5e-17 


62.0 


12 






6e-20 


79.7


14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15


zf-MYND


MYND finger


1.3e-06


35.4 


16 


zf -MYND 


MYND finger 


1.3e-06 


35.4 


17 


zf-C2H2


Zinc finger, C2H2 type


1.7e-95




18 


CAP_GLY 


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119


410.5 


21 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


J z> £ . b 


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-75


277.0


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit


0


1077.7


26 


C1q


C1q domain


1.9e-10


44.4 


27 


Ribosomal_L2
3


Ribosomal protein L23


7.8e-32


111.2 


28 


Ribosomal_L2
3


Ribosomal protein L23


1e-29


104.2


30


zf-A20


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20


A20-like zinc finger


1.5e-10


AD C. 


32 


CMM /4 V, 


FMN-dependent dehydrogenase




608.1 


34 


PID


Phosphotyrosine interaction
domain (PTB/PID)




z u y . j 


35 


ig


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig


Immunoglobulin domain 




48.8


40 


kinesin 


Kinesin motor domain 


6.7e-76


265.6 


44


Ets


Ets-domain


1.4e-56


1 O £. . X 


45 


Ets


Ets-domain


1.4e-56


182.1 


46 


LRR 


Leucine Rich Repeat 


1.7e-13


58.3


48


zf-C2H2


Zinc finger, C2H2 type


2.3e-162


ceo n 


49


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05


31.9 


50 


UCH-2


Ubiquitin carboxyl-terminal


J. . 1G £0 


102.0


51 




Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0


52 


ras 


Ras family 


8.5e-45


162.3 


53 


PRK 


Phosphoribulokinase 


2.1e-65


230.7 


54 


myb_DNA- 
binding 


Myb-like DNA- binding domain 


0.056 


15.2 


55 


voltage_CLC


Voltage gated chloride channels


3.3e-186


631.9 


56 


sugar_tr


Sugar (and other) transporter 


0.00015 


-64.3


57 


TBC 


TBC domain 


2.2e-37


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat 


5.9e-25


96.3 


67 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


7.9e-49 


175.6 


68


C2 


C2 domain 


7.9e-54


192.2 


69


C2 


C2 domain 


2.3e-54 


194.0


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 


ig


Immunoglobulin domain 


8.2e-28 


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase


Eukaryotic protein kinase 
domain 


2.8e-38 


140.6 


ie 


zf-
C4_Topoisom


Topoisomerase DNA binding C4
zinc fing


5.4e-54 


192.8 


82 


Peptidase_S9 


Prolyl oligopeptidase family


4.3e-10


36.8 


84 


fn3


Fibronectin type III domain 


4.1e-51 


183.2 


86 


SH2 


Src homology domain 2 


3.1e-22 


67.7 


88 


ig


Immunoglobulin domain


0.0091


14.0 


89


WD40


WD domain, G-beta repeat 


2.1e-23 


84.6 


92 


laminin_G


Laminin G domain


6.1e-27


98.5 


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13 


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96


pkinase


Eukaryotic protein kinase 
domain 


2.6e-51 


183.9 


97 


adh_short


short chain dehydrogenase 


2e-63 


217.5 


98


kinesin 


Kinesin motor domain 


2.2e-86 


300.4


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36


133.0 


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05 


-5.2 


104 


pkinase


Eukaryotic protein kinase 
domain 


2.7e-73 


256.9 


106 




Ras family 


8.3e-24 


92.5 


107 


FYVE 


FYVE zinc finger 


5.4e-27 


100.7 


108


Cyt_reductas


FAD/NAD-binding Cytochrome
reductase


7.7e-61


215.5 


109 


zf -C2H2 


Zinc finger, C2H2 type 


2.3e-122 


420.0 


112 


pkinase


Eukaryotic protein kinase 
domain 


4e-8E 


306.2 


lie 


PH


PH domain


3.1e-11


45.2 


117 


lipocalin 


Lipocalin / cytosolic fatty-
acid binding pr


2.4e-14


53.5 


ne 


pkinase


Eukaryotic protein kinase 
domain 


4.5e-20


76.3 


120 


WD40


WD domain, G-beta repeat


2.4e-14


61.1 


123 


WD40


WD domain, G-beta repeat


2.4e-14


61.1 


123 


IF5 eIF4 elF 

2 


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2 


124 


ig


Immunoglobulin domain 


6 .5e-0£ 


30.6 


127 


mito_carr 


Mitochondrial carrier proteins 


3e-16 


58.6 


128


PP2C


Protein phosphatase 2C


2.2e-71


250.6 


129 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130


pfkB


pfkB family carbohydrate kinase 


4 .Se-42 


137.1 


133 


ACBP 


Acyl CoA binding protein 


4.6e-22


86.7 


134 


rrrri 


RNA recognition motif. 


1 .2e-31 


118.5 


135 


IQ


IQ calmodulin-binding motif


2.6e-08


41.0 


136 


ATP1G1_PLM_M 
AT8


ATP1G1/PLM/MAT8 family 


9.3e-22 


85.7 


139 


WH2


Wiskott Aldrich syndrome 
homology region 2 


0.0067 


23.1 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-82


287.5 


14: 


Peptidase_S2 
6 


Signal peptidase 3 


5.7e-10


35.7 


143 


arf 


ADP-ribosylation factor family 


1.2e-39 


145.2


14€ 


KRAB


KRAB box


7.3e-30 


112.6 


14fc 


DUF6 


Integral membrane protein DUF6 


0.096 


8.0 


149


PDEase


3'5'-cyclic nucleotide
phosphodiesterase


3.6e-80 


231.1 


153 


S4 


S4 domain 


1.1e-06


42.3 


153 


tRNA-synt_1d


tRNA synthetases class I (R)


3.8e-103 


356.1 


154 


Cyt_reductas
e 


FAD/NAD-binding Cytochrome 
reductase 


7.8e-60 


212.2 


155 


ras 


Ras family 


3.6e-26 


107.0 


157 


actin 


Actin 


3.8e-26 


87.1


lse 


Jacalin


Jacalin-like lectin domain


0.05


-24.9


160


Zn_carbOpept


Zinc carboxypeptidase


5e-136


471.9


16J: 


pkinase 


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1




zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.3e-07


27.0 


166 


Ribosomal_S1
5


Ribosomal protein S15


1.1e-06




165 


DEAD 


DEAD/DEAH box helicase


1e-46


157.0


171 


DUF59 


Domain of unknown function
DUF59


0.07


-17.4 


17: 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15


JO . o 


172 


globin 


Globin 


4.6e-18


o / . 4 


174 


WW 


WW domain 


7.3e-06


32.9


lit 


ras 


Ras family 


1e-31


118.8 


176 


ATP1G1_PLM_M 
AT8


ATP1G1/PLM/MAT8 family


2.5e-17 


71.0


175 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2


180


C1q


C1q domain


e.8e-72 


251.9 


190


Y_phosphatas
e


Protein-tyrosine phosphatase


4.9e-287


967.0


193 


efhand


EF hand


7.5e-16


66.1


193 


pkinase 


Eukaryotic protein kinase 
domain 


6 .Se-82 


285.6


194 


bromodomain 


Bromodomain 


5.8e-31 


111.4


19b 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64 


227.1


197 


DnaJ 


DnaJ domain 


1.6e-38 


141.4


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2 .Se-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.5


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7


205 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.6e-139


476.9


206 


ldl_recept_a


Low-density lipoprotein 
receptor domain 


2.4e-25


97.6


209 


ank 


Ank repeat 


1.4e-19


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035


1.2


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258.6


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.5


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43


140.4


216 


PMP22_Claudi
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21


83.4 


216 


Glycos_trans
f_2 


Glycosyl transferases 


4e-21 


63.6 


215 


ig 


Immunoglobulin domain 


0.092


10.7 


222 


WD40


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08


42.1 


225 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38


1 A 1 C 


229 


HSP70 


Hsp70 protein


2.4e-54


xyi . u 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2


231 


tsp_l 


Thrombospondin type 1 domain


0 . 0075 


±i-± 


233 


cyclin 


Cyclin 


4.6e-144


492.0


234


ras 


Ras family 




179.1


235 


LRR 


Leucine Rich Repeat 


1.2e-30


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09


45.0


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.3 


245 




Immunoglobulin domain 


6.7e-08


30.5


248 


wnt


wnt family of developmental
signaling protei


5.1e-270


742.6 


250 


mito_carr 


Mitochondrial carrier proteins 


1.3e-55


193.6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14


55.7 


255 


Cation_efflu
x


Cation efflux family 


2.8e-33


124.0


256 


SH3 


SH3 domain 


3.9e-14 


60.4 


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52 


187.2 


258


adenylatekin
ase 


Adenylate kinase 


2.1e-110 


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQ
Q 


PQQ enzyme repeat 


1.6e-lS 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6 .Se-64 


225.7


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27


101. c 


27C 


filament 


Intermediate filament proteins 


3.2e-150


512 .t 


271 


Choline_kina 
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


272 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


269. S 


280 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


281 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 


284 


zf-DHHC


DHHC zinc finger domain


4.6e-24


93.4 


287 


Exonuclease 


Exonuclease 


1.4e-67


238.0 


291 


SAM 


SAM domain (Sterile alpha
motif)


0.034


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7 


295 


zf-C2H2


Zinc finger, C2H2 type


2.2e-125


430.0 


296 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 


297 


HMG_box


HMG (high mobility group) box


6.7e-29


109.4


302 


Glycos trans 


Glycosyl transferase 


5e-87 


302.5 


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.6 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44


160.6 


3C6 


7tm_l 


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


126.1 


309 


DNA_polymera 
seX 


DNA polymerase X family 


2.4e-64


227.2 


31a 


F-box 


F-box domain. 


?.5e-08 


39.2 


312 


ig 


Immunoglobulin domain 


6.8e-19


65.9 


313 


Ets 


Ets-domain 


8.1e-60 


192.3 


315 


Kelch 


Kelch motif 


1.3e-106


367.6 


317 


arf


ADP-ribosylation factor family


3.2e-35


130.4 


3ie 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81


282.6 


32* 


Xlink


Extracellular link domain


4.5e-143


331.5 


326 


ARID 


ARID DNA binding domain 


S.le-37 


136.4 


327 


HMG_box 


HMG (high mobility group) box


6.7e-29 


109.4 


328 


cadherin


Cadherin domain 


8.1e-81 


281.5 


331 


chromo 


'chromo' (CHRromatin
Organization Modifier)


4e-18 


66.7 


333 


Peptidase_M2 
2 


Glycoprotease family 


1.2e-136


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2.3e-07


37.? 


339 


ras 


Ras family 


7.8e-07 


-59.1


340 


zf-C2H2


Zinc finger, C2H2 type 


8.2e-64 


225.4


342


zf-C2H2


Zinc finger, C2H2 type 


2.4e-85 


297.0


343


ig


Immunoglobulin domain 


0.0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1


351 


EGF 


EGF-like domain 


8.5e-20 


79.2 


352 


ank 


Ank repeat 


2.5e-101 


350.0 


354 


TBC 


TBC domain 


5.1e-15 


63.2 


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033 


15. & 


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53 


189.7 


363 


efhand 


EF hand 


5.4e-10


46.6 


367 


LRR 


Leucine Rich Repeat 


8.8e-44 


158.5 


368 


laminin_G


Laminin G domain


1.5e-33


121 ."/ 


369 


PP2C


Protein phosphatase 2C


5.3e-20


73.9 


372 


LIM


LIM domain containing proteins 


9.9e-15 


57.1 


373 


KRAB 


KRAB box 


4.8e-23 


90. C - 


376


ion_trans


Ion transport protein 


2.9e-09 


-4.2 


377 


Beach 


Beige/BEACH domain


4.9e-208


704.5


380 


pkinase


Eukaryotic protein kinase 
domain 


1.6e-94 


327. S 


381 


AMP-binding


AMP-binding enzyme


1.4e-07 


-140.3 


382 


HECT 


HECT-domain (ubiquitin- 
transferase) . 


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101 


350.0 


386 


ig


Immunoglobulin domain 


9.5e-06 


23 .t 


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6


389 


ig 


Immunoglobulin domain 


2.8e-15 


54 . :• 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233.2 


392 


TPR 


TPR Domain 


6.1e-17 


69.0 


393 


SH3 


SH3 domain 


3.5e-09 


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21 


83. fc 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2


Zinc finger, C2H2 type 


0.0066 


23. j 


399 


fn3 


Fibronectin type III domain 


4.1e-102


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26. t. 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0


405 


cadherin 


Cadherin domain 


8.1e-81


281.9 


406 


zf-cxxc 


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF 


RhoGEF domain 


1.1e-23


92.: 


411 


F-box 


F-box domain. 


4.2e-06 


33.7 


412 


SNF2_N 


SNF2 and others N- terminal 
domain 


S.8e-16 


61.6 


415 


CPSase_L_cha 
in 


Carbamoyl -phosphate synthase 
(CPSase) 


1.5e-172


586.6 


418 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch


G-patch domain


1e-19


78. S 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.3 


426 


Plexin_repea 
t 


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repea
t


Plexin repeat


0.0023


24.6


425 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase


1e-66


214 .C 


432 


SH3 


SH3 domain 


3.4e-16 


67.2 


433 


GTP_CDC


Cell division protein 


2.1e-114 


393.5 


436 


Collagen


Collagen triple helix repeat
(20 copies)


4.6e-194


658.1 


438 


Ricin_B_lect
in


Similarity to lectin domain of
ricin b


0.0085 


10.5 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai 


1.2e-256 


866.0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235


795.1 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1.9e-65 


230. S 


445 


LON


ATP-dependent protease La (LON) 
domain 


0.00012 


-17.1 


446 


ig


Immunoglobulin domain


0.00011


20. j 


451


sushi


Sushi domain (SCR repeat)


1.4e-18


75.2 


452 


fn3 


Fibronectin type III domain 


1.5e-06


35.2 


454 


pyridoxal_de
c


Pyridoxal-dependent
decarboxylase conse


8.3e-14


50.3 


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4 


457 


neur_chan


Neurotransmitter-gated ion-
channel


1e-175


597.1


458 


Josephin 


Josephin


0.0002


18.7 


468 


bZIP 


bZIP transcription factor 


1.7e-07 


31.6 


470 


NTP_transfer
ase


Nucleotidyl transferase


6.3e-06


-26.3 


473 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9


473


LIM


LIM domain containing proteins 


0.00021 


20.7 


477 


zf-RanBP


Zn-finger in Ran binding
protein and others.


0.028


21.0


479 


WD40


WD domain, G-beta repeat


6.5e-18


73 .C 


480 


KRAB


KRAB box


1e-31


118.8


481 


ArfGap


Putative GTP-ase activating 
protein for Arf 


8.4e-66 


232.0


485


SH2


Src homology domain 2 


0.011 


11.4


486


C1q


C1q domain


4.3e-74


2S9. 6 


487 


dsrm


Double-stranded RNA binding
motif


1.1e-47


171.9


489 


zf-C2H2


Zinc finger, C2H2 type


4.8e-153


521.9


490 


Alpha_adapti 
n_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751.6


492 


SKI 


Shikimate kinase 


1.2e-10


48.6


497


ENV_polyprot
ein 


ENV polyprotein (coat 
polyprotein) 


2.6e-22


77.6 


498 


abhydrolase_ 
2 


Phospholipase/Carboxylesterase


0.041 


-48.1 


500 


rrm


RNA recognition motif.


5.4e-34


126.4


501 


WW 


WW domain 


4.6e-18 


73.4


502


ig


Immunoglobulin domain


1.1e-10


39.5


504 


abhydrolase 


alpha/beta hydrolase fold 


0.045 


j 


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0


508


Na_K_ATPase_
C


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease 


Exonuclease 


1.3e-56 


201.5


510


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


511 


Glycos_trans
f_2 


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38. b 


514 


pro_isomeras
e 


Cyclophilin type peptidyl- 
prolyl cis-tr 


1.8e-63 


221.4


515 


EGF 


EGF-like domain


1.9e-18


74.7


sie 


Surp 


Surp module 


4.3e-36 


140.0 


523 


ig


Immunoglobulin domain 


3.3e-06 


25.0 


526 


UBX 


UBX domain 


1.1e-34


126.6 


52 E 


adh_zinc


Zinc-binding dehydrogenases 


2.7e-34 


127.4 


530


SAM


SAM domain (Sterile alpha
motif)


0.046


10.0 


531 


adh_short


short chain dehydrogenase


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-83


281.7 


533 


mito_carr


Mitochondrial carrier proteins 


2e-63 


213.5 


534


thiolase


Thiolase


3.5e-183


622.0 


535 


FMO-like 


Flavin-binding monooxygenase-
like


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-Sb 


196.6


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


538


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0 


539 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


1.9e-117 


403.6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-69 


242.6 


544 


DUF101 


Protein of unknown function
DUF102 


8.5e-38 


139.0 


545 


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2


547 


WD40


WD domain, G-beta repeat


2.6e-32 


120.8 


548


RHD


Rel homology domain (RHD) . 


. i.6e-238 


686.2 


549 


MMR_HSR1 


GTPase of unknown function 


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127 


435.6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74 


259. e 


555 


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-25 


109.7 


561 


AMP-binding 


AMP-binding enzyme


2.8e-06 


-163.7 


562 


PABP 


Poly-adenylate binding protein, 
unique domai 


4.9e-38


139.8 


564 


Gag_p30 


Gag P30 core shell protein


1.2e-67 


238.2 


566 


PWWP 


PWWP domain 


8.1e-16 


66.0 


567 


SCAN 


SCAN domain 


7.3e-66 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84


294.3 


570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081 


-79.7 


572 


myosin_head 


Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23 


91.5 


576 


Surp 


Surp module 


1.7e-23 


91.5 


577 


DNA_pol_B


DNA polymerase family B 


0 


1138.6 


578


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


8.3e-05 


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3 


580 


neur_chan


Neurotransmitter-gated ion-
channel


5.9e-177


601.3 


583 


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3 


586 


KH-domain 


KH domain 


2.9e-13 


57.5


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


585 


LIM


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


5$; 


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1


592 


PHD 


PHD-finger


3.8e-12


53.8 


594 


cadherin 


Cadherin domain


4.2e-9S 


342.7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-V> 


319.2


597


WD40


WD domain, G-beta repeat


0.00054


26.7 


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-52


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-4: 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67 


232.3


608


PWWP


PWWP domain 


2.6e-28 


107.5 


609


PWWP


PWWP domain


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_bind
ing


RFX DNA-binding domain


S.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-82


284.8


617


kinesin


Kinesin motor domain


8.4e-80


278.5


618


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.0096 


13.1 


620 


MATH 


MATH domain 


7 .8e-Cb 


22.2 


623 


Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32


121.6


622 


pkinase 


Eukaryotic protein kinase 
domain 


4 .4e-4C 


146.6 


623 


BNR 


BNR repeat 


2.1e-11


SI. 3 


624 


molybdopteri
n


Prokaryotic molybdopterin
oxidoreductas


1.4e-12


42.2 


62S 


TPR


TPR Domain


1.1e-17


72.2 


627 


cNMP_binding


Cyclic nucleotide-binding 
domain 


3 ,7e-5E 


206.6


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type


2.1e-88


307.1 


632 


rrm 


RNA recognition motif. 


4e-0S 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104


360.7


636


Fork_head


Fork head domain


5.9e-27


103.0


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70 


246.5 


642 


TPR 


TPR Domain 


4.8e-08


40.1 


643 


efhand


EF hand 


1.9e-27 


104.6 


647 


SNF2_N


SNF2 and others N-terminal
domain 


1.2e-102 


351.1 


648


PseudoU_synt
h_2


RNA pseudouridylate synthase


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0067 


22.7 


651 


ank 


Ank repeat 


1.3e-17


71.9


652 


I_LWEQ


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan


Neurotransmitter-gated ion-
channel


4.1e-171


581.8 


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


659 


FH2 


Formin Homology 2 Domain


1e-107


371.2 


661 


pou 


Pou domain - N-terminal to 
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2 


663 


C2 


C2 domain 


6.7e-19 


76.2


664 


C2 


C2 domain 


6.7e-19 


76.2 


667 


GST 


Glutathione S-transferases.


9.3e-34 


114.4 


666 


LRR 


Leucine Rich Repeat 


9.3e-3i 


115.6 


670 


spectrin 


Spectrin repeat 


4e-5*7 


203.2 


672 


I_LWEQ


I/LWEQ domain 


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter 


S.3e-60 


212.8 


674 


WD40


WD domain, G-beta repeat


4.8e-24


93.3


675


WD40


WD domain, G-beta repeat


4.8e-24


93.3 


67fc 


LRR 


Leucine Rich Repeat 


0.0015 


25.2 


679 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-29 


107.7 


680 


zf-C2H2


Zinc finger, C2H2 type


5.2e-05 


30.1 


681 


CH


Calponin homology (CH) domain


2.4e-17


71.1 


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43


156.6 


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.051 


10. e 


687 


Synapsin 


Synapsin 


0 


1890.8 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8 


651 


homeobox 


Homeobox domain 


8 . Se-30 


112.4


656 


Peptidase_M2 
4 


metallopeptidase family M24 


2.6e-59 


210.5 


657 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9 


696 


PHD 


PHD-finger


0.006 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase


Sulfatase 


3e-231 


781.6 


703 


zf-C2H2


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707


Acyl_transf


Acyl transferase domain


1.1e-22


88.6 


706 


WD40


WD domain, G-beta repeat


4.8e-15


76.7 


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


712 


DEAD 


DEAD/DEAH box helicase 


9.9e-42 


134.9


714 


PH


PH domain 


1 .6e-0S 


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1 .5e-3? 


i3e.2 


717 


Sialyltransf 


Sialyltransferase family


7.5e-31


115.9 


718


ig


Immunoglobulin domain


1e-25


100.8 


715 


integrin_B


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2 


Calpain family cysteine 
protease 


3e-145 


495.9 


723 


ig


Immunoglobulin domain 


2 .2e-0S 


22.4 


720 


F-box 


F-box domain. 


0.007


23 .C 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


727 


WD40


WD domain, G-beta repeat 


7 .Se-26 


99.3 


730


dsrm


Double-stranded RNA binding
motif


0.027


12.1 


731 


dynamin


Dynamin family


4.2e-16


66.9 


733 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H 
type 


2 .8e-lG 


41.7 


735 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26 


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5 


735 


TSC22 


TSC-22/dip/bun family 


6.5e-32 


119.5 


742


ras


Ras family


2.2e-100


346.9 


743


PMI_typeI


Phosphomannose isomerase type I


1.2e-243 


822.9 


747 


trypsin 


Trypsin 


6.4e-8B 


279.4 


7<je 


kazal 


Kazal-type serine protease
inhibitor domain 


2.2e-S2 


187.4 


74 5 


efhand


EF hand 


6.3e-06 


33.1 


751 


PHD 


PHD-finger


4 .Se-16 


66.7 


752 


zf-C2H2


Zinc finger, C2H2 type


3.2e-21


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8 


754 


Ribosomal_L3
9


Ribosomal L39 protein


0.00018 


26.7 


75b 


PH 


PH domain 


3.6e-14 


55.7 


756 


SCAN 


SCAN domain 


1.4e-53 


191.5 


755 


PA 


PA domain 


0.0065 


23.1 


760 


arf 


ADP-ribosylation factor family 


2.2e-19


77.8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


If '■ 

/Oi 


histone


Core histone H2A/H2B/H3/H4


9 • Se-53 


188.6 






M y jn jj x j rip e x 


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767


vwc


von Willebrand factor type C
domain


2 . Se-34 


1 0*7 "3 1 


769 






. 

4.8e-13




770 


zf-C4


Zinc finger, C4 type (two
domains)


2.4e-53


1 0 X . D 


772 




Ras family


7e-90


312.0 


773 


Sulfatase


Sulfatase


1e-142


487.5 


775 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


DO . o 


776 


zf-C2H2




1.1e-12


cc c 

J3 . _> 


in 
tit 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


778




RNA recognition motif. 




121.1 


779 


G6PD


Glucose-6-phosphate
dehydrogenase


1.5e-76 


236.6 


780 




Spectrin repeat 


3.7e-29


110.3




mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


TOT 




SCAN domain


1.3e-24


95.2


781' 


PDZ


PDZ domain (Also known as DHR
or GLGF).


4.1e-07


37.1 


76b 


DEAD 


DEAD/DEAH box helicase


6e-06


m *> 

21 » 7 


/DC 


ras 


Ras family 


5.3e-39


143.0




T"} Hi -~. _ a T IT ^ 


Ribonuclease HII


2.5e-67


237.1


790 


PI3_PI4_kina
se


Phosphatidylinositol 3- and 4-
kinases


5 . 4e-10£ 


372.2


795 


cadherin 


Cadherin domain 


2 . Se-40 


147.4 


796 


ARID 


ARID DNA binding domain 


1.6e-20


81.6


797 


trypsin 


Trypsin 


9.9e-20


64.8


799 


CH 


Calponin homology (CH) domain 


3.7e-15


63 . B 


803 


Gal-
bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7


803 


WD40 


WD domain, G-beta repeat 


0.00082


26.1 


806 


TBC 


TBC domain 


1.8e-26


101.4


807 


TBC 


TBC domain 


1.8e-26


101.4


806 


CN_hydrolase


Carbon-nitrogen hydrolase 


8 . Be - 80 


278.5 


813 


CBFD_NFYB_HM
F


Histone-like transcription
factor


6e-14


59.8 


811 


adh_short 


short chain dehydrogenase 


8.1e-20


79.3


814 


IMP4 


Domain of unknown function 


3.3e-71


250.0


81^ 


zf-C2H2


Zinc finger, C2H2 type


8.2e-66


232.1


Bit 


Pept_tRNA_hy 
dro 


Peptidyl-tRNA hydrolase


1.6e-37


138.0 


817 


ARID 


ARID DNA binding domain 


2 . Se- 18 


74.3


826 


IFS_eIF4_elF 


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32 


121.5 




ArfGap


Putative GTP-ase activating 
protein for Arf 


1 . Se-53 


191.3


OJ - 


LRR


Leucine Rich Repeat


2.1e-26


101.1


832 


laminin_EGF 


Laminin EGF-like (Domains III
and V)


2e-57 


204.2 


839 


rrm


RNA recognition motif. 


1.3e-22 


88.5 


840


Y_phosphatase


Protein-tyrosine phosphatase


2.6e-119


4C9 ! 8 


841 


pkinase 


Eukaryotic protein kinase 


3.4e-100 


346.3 


844 


Ribosomal_L2 
2e 


Ribosomal L22e protein family


1e-64


228.4 


846 


IBR


IBR domain


9e-15


62.5


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.4e-07


26.5 


850


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


0.00016 


18.9 


853 


SET 


SET domain 


5e-30 


113.2 


0S2 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853 


SRCR 


Scavenger receptor cysteine - 
rich domain 


0 


1025.4 


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0


958 


COX6A 


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7


855 


rrm 


RNA recognition motif.


5.4e-45


162.9


861 


PRK


Phosphoribulokinase


5.1e-62


219.4


863 


mito_carr 


Mitochondrial carrier proteins 


■s ""in*. CI 
s. . 9e-b i 


185.5 


864


HSP90 


Hsp90 protein 


4.7e-158 


538.5 


866 


ig


Immunoglobulin domain 


4e-12 


44 . j 


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149. e 


874 


CPSase_L_cha 
in 


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218 


739. C 


879 


Ribosomal_S1
2e


Ribosomal protein S12e 


2.1e-98 


340.3 


882 


serpin


Serpins (serine protease
inhibitors)


2.5e-42


145.7


883 


Patatin 


Patatin 


1.2e-Sl 


182 . C 


884 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.044 


8.C 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12 


54.3


889 


sugar_tr 


Sugar (and other) transporter 


8.2e-63 


222.2 


893 


DUF2 8 


Domain of unknown function 
DUF26 


1.3e-43 


158.3


896


IP_trans


Phosphatidylinositol transfer
protein


6.5e-98 


338.7 


898 


DEAD 


DEAD/DEAH box helicase 


1.5e-48


156.5


899 


KE2 


KE2 family protein 


7e-61 


215.7


900 


KE2 


KE2 family protein 


4.3e-51 


183.2 


901 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-57 


203. e j 


902 


ras 


Ras family 


2.3e-75 


263 . e 


904 


TPR 


TPR Domain 


3.2e-22 


87.2 


906 


GBP


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6 


908 


WD40


WD domain, G-beta repeat 


2.6e-26 


100.6 


909 


PH


PH domain


1.3e-09


39.4


910 


zf-C2H2 


Zinc finger, C2H2 type 


2.Se-39 


144.1 


913 


Epimerase


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40


WD domain, G-beta repeat 


1.6e-25 


98.2 


923 


WD40 


WD domain, G-beta repeat


8.2e-07 


36.1 


924 


Hydrolase 


haloacid dehalogenase-like
hydrolase


2.9e-05


29.1 


925 


UQ_con 


Ubiquitin-conjugating enzyme


0.00033


-27.6


926 


CH 


Calponin homology (CH) domain 


3.3e-53


190.2 


928 


WD40


WD domain, G-beta repeat


5.9e-46


172.7


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10 


37.4 


930 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


7.2e-105


361 .£ 


931 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4


936 


C2 


C2 domain 


2.2e-62


220.7


937 


NAP_family 


Nucleosome assembly protein 
(NAP) 


1.1e-22


84.6 


940 


abhydrolase


alpha/beta hydrolase fold


0.011


3.1


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948


pkinase


Eukaryotic protein kinase
domain


3.4e-75


263.2 


949 


WD40


WD domain, G-beta repeat


1.8e-27


104.7


950 


Acyltransfer
ase


Acyltransferase


1.6e-07 


38.4


951 


SAM 


SAM domain (Sterile alpha
motif) 


0.014 


14 .£■ 


954 


GFO_IDH_MocA


Oxidoreductase family


1.3e-11


52.0


955 


BTB 


BTB/POZ domain 


7e-22 


86.2


956 


BTB 


BTB/POZ domain


7e-22 


86.: 


957 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


6. 053 


-22.1 


959 


ras 


Ras family 


2.4e-97 


336. £ 


960 


ras 


Ras family 


8.4e-43 


155. t 


961 


Acetyltransf 


Acetyltransferase (GNAT) family


1.2e-08


42.2


962 


adh_short 


short chain dehydrogenase 


2.4e-31 


117 A 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.; 


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193 


653 


970 


RNase_PH


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


3.6e-21


83.7 


978 


Ribosomal_L1
7


Ribosomal protein L17


2.4e-20 


81. C 


979 


LIM


LIM domain containing proteins 


5.8e-42 


152.1 


980 


Calsequestri
n


Calsequestrin


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10 


43.S 


983 


oxidored_q6


NADH ubiquinone oxidoreductase,
20 Kd sub


4.8e-63


222. S- 


988 


TBC 


TBC domain 


2.2e-50


180.1 


989 


TBC 


TBC domain 


2.2e-50 


180. t 


993 


tRNA_int_end 
o 


tRNA intron endonuclease


0.0017 


-34.2 


994 


homeobox 


Homeobox domain 


4e-18 


73. £ 


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012


11.6 


1000 


mito_carr 


Mitochondrial carrier proteins 


9.7e-123 


421 .; 


1001 


RA


Ras association (RalGDS/AF-6)
domain


1.2e-15


65.4 


1004


DUF81


Domain of unknown function
DUF81


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130


428. € 


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661. e 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6 


1012 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


4.7e-15


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP 


RhoGAP domain 


1.6e-78


274.3 


1022 


PGAM 


Phosphoglycerate mutase family 


3.8e-18 


69.7 


1026 


HMG_box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162. b 


1028 


UQ_con


Ubiquitin-conjugating enzyme


1.4e-49


178.3 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


0.028 


16.3 


1034 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6 


1037 


KRAB 


KRAB box 


4.Be-06 


32.4 


1038 


Cation_efflu
x


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47


169.1 


1042 


WD40 


WD domain, G-beta repeat


1.9e-18


74.7 


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c 


Lectin C-type domain


1.9e-28


108.0


1046 


Glucosamine_
iso


Glucosamine-6-phosphate
isomerase


0.00013 


-25.3


1047 


ligase-CoA 


CoA-ligases 


4 .5e-8c 


279.4


1049 


ig


Immunoglobulin domain


1.7e-09


35 .£ 


1050 


Ribosomal_L2 
4e 


Ribosomal protein L24e


2e-32


124.5


1054 


Amidase 


Amidase 


4 .3e-lS2 


518.1


1055 


rrm 


RNA recognition motif.


3 . Be-26 


100.3


1058 


annexin 


Annexin 


6.9e-44 


is9.; 


1059 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family


0.023 


-23 .t 


1060 


homeobox


Homeobox domain 


3 . 2e~2l 


117 .>. 


1062 


Acyltransfer
ase


Acyltransferase


0.00065


10.1


1064 


AMP-binding 


AMP-binding enzyme 


6.6e-100


345.3 


1065 


LRR 


Leucine Rich Repeat 


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141 . t 


1071 


ig


Immunoglobulin domain


8.4e-48


159.3


1072 


PHD 


PHD-finger


6.8e-07


36.3


1074 


DENN


DENN (AEX-3) domain


8.3e-33


121 - b 


1075 


SCP 


SCP-like extracellular protein 


4 . 7e-4 j 


149 - h 


1077 


OLF 


Olfactomedin-like domain


2.2e-6£ 


234. 0 


1078 


mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat 


6.2e-45 


162.' 


1007 


START 


START domain 


1.5e-4£ 


174 .7 


1093 


DSPc


Dual specificity phosphatase, 
catalytic doma 


3.3e-62 


223.4 


1094 


GSHPx 


Glutathione peroxidases 


9.6e-41


148.6 


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75 


262.4 


1105 


Nitroreducta
se 


Nitroreductase family 


1.3e-13 


58.6 


1106 


PTE 


Phosphotriesterase family 


1.3e-179


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain


0.00045 


19. e 


1109 


ras 


Ras family 


1.3e-15 


40.7 


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


9.7e-47 


168.7 


1116 


HMG14_17 


HMG14 and HMG17


4.4e-21


63.5


1117 


HMG14_17 


HMG14 and HMG17


9.9e-12


52.4


1119 


FAA_hydrolas 
e 


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83


290.6 


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 


1123 


abhydrolase 


alpha/beta hydrolase fold 


9.2e-23 


£9 . 0 


1129 


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr


2.2e-56


197.3


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114.9


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9 


1134 


PH 


PH domain 


0.0015 


17 . e 


1136 


Adap_comp_su
b 


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_su
b 


Adaptor complexes medium 
subunit family 


2.5e-209 


708 .£ 


1139 


ras 


Ras family 


1.5e-86 


301.0


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4


1152 


Acyltransfer
ase


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1


1155 


ig


Immunoglobulin domain


1.3e-31


lUo . :? 


1157 


Asparaginase
_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9


1162 


linker_histo
ne


linker histone H1 and H5 family


3.6e-14 


60.4 


1164 


DED


Death effector domain 


3 .9e-0 = 


30.5 


1165 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


11 6£ 


IRS 


PTB domain (IRS-1 type) 


2.6e-43 


157.3 


1166 


SAM 


SAM domain (Sterile alpha 
motif) 


0.04 


10.5


117C 


abhydrolase 


alpha/beta hydrolase fold


O.C96 


-7.5 


1174 


SAP 


SAP domain 


3.9e-10 


47.1 


1177 


PP2C 


Protein phosphatase 2C 


5.3e-33 


112.5 


1176 


WD40


WD domain, G-beta repeat 


4.7e-35 


129.9 


118C 


Ets


Ets-domain


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016


24.7


1162 


TCL1_MTCP1


TCL1/MTCP1 family


9.5e-56 


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr 


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1186 


Orn_DAP_Arg_
deC


Pyridoxal-dependent
decarboxylase


6.2e-l2e 


430.6 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1155 


Sec1


Sec1 family


3.2e-183 


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32 


111.8 


1197 


Glyco_transf
_8


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methylt
ran


ubiE/COQ5 methyltransferase
family


1.3e-121


417.4 


1208 


7tm_3


7 transmembrane receptor 


7.2e-09 


29.0 


120S 


ank 


Ank repeat


3.9e-15 


63.7 


121C 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7 


1212 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand 


3.2e-07 


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71


251.1 


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


PX domain


2.2e-15


64.5


1233 


PX 


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain


3 .3e-0S 


44.0 


1241 


Peptidase_M2
0


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006


Metalloenzyme of unknown
function UPF0006


6.3e-61


215.8 


1248 


Glycos_trans
f_2


Glycosyl transferases


4.5e-10


46.9 


1249 


efhand 


EF hand 


4e-11


50.4 


12S4 


UQ_con


Ubiquitin-conjugating enzyme


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_trans
f


Formyl transferase


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1263 


DiHfolate_re
d


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transp
ept


Gamma-glutamyl transpeptidase 


1.8e-110 


380.4 


2263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9


126< 


SCP


SCP-like extracellular protein 


6e-2f 


108.0 


126"/ 


K_tetra 


K+ channel tetramerisation 
domain


2.8e-27 


104.0 


126^ 


ras 


Ras family 


1.3e-85 


297.9 


127£ 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.2e-lC 


37. C 


127£ 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.6 


1277 


abhydrolase 


alpha/beta hydrolase fold


5.6e-23 


83.1 


127$ 


trypsin 


Trypsin


4 .4e-42 


132.0 


1280 


PBP 


Phosphatidylethanolamine- 
binding protein 


1.3e-13 


58.7 


128L 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat 


1.7e-52 


187.8 


1296 


fn3


Fibronectin type III domain


0. 02fc 


20.9 


129E 


GBP 


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14 


60.7 


129£ 


LIM


LIM domain containing proteins


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43


145.2 


1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-5> 


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17


71.6 


131C 


UPAR_LY6


u-PAR/Ly-6 domain 


7.1e-2C 


75.5 


1313 


thiored


Thioredoxin 


3.6e-05 


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


3316 


trypsin 


Trypsin 


4.4e-43


132.0 


1320 


Ribosomal_L1
3


Ribosomal protein L13


3.9e-62 


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like
repeat


0.0054 


23.4 


I32fc 


KRAB 


KRAB box 


O.OSi 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-4G 


147.7 


1330 


Bcl-2


Apoptosis regulator proteins,
Bcl-2 family


0.014 


-1.6 


1333 


PX 


PX domain 


2.le-lC 


48.0


1333 


KRAB 


KRAB box


1.8e-36 


234.6 


1334 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


2.3e-8£ 


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.2e-3l 


118.6 


1337 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12 


54.5


1336 


TPR 


TPR Domain 


0.00021 


28.1 


1340 


metalthio


Metallothionein 


0.013 


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09 


36.5 


1343


Band_41


FERM domain (Band 4.1 family)


1.3e-3e 


122.5 


1344 


Kelch 


Kelch motif


1.4e-44 


161.5 


134b 


Antifreeze 


Antifreeze protein


1.2e-10


48.8 


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera 


0.086 


-177.2 


1348 


BTB 


BTB/POZ domain 


5.3e-28


106.5 


134S 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355


DEAD 


DEAD/DEAH box helicase 


3.6e-65 


209.0 


1356


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17


7.9e-4C 


145.7


136S 


SIS 


SIS domain 


3.6e-30


113.6 


1363 


SIS 


SIS domain


1.3e-28 


108.5 


1364 


ig 


Immunoglobulin domain 


0.00026 


19.0 


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


68.9


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113


390.1


1372 


DnaJ


DnaJ domain


6.6e-36


132.7


1376 


KRAB


KRAB box


2.1e-38


141.0


1378 


ELM2 


ELM2 domain


2e-23 


91.3 


136C 


thiored 


Thioredoxin


1.2e-23


82.8


1361 


ank 


Ank repeat 


2.3e-83


290.4


1362 


BTB 


BTB/POZ domain 


3e-ll 


50.8 


13 63 


WD40 


WD domain, G-beta repeat 


1.6e-19 


78.3


1364 


WD40 


WD domain, G-beta repeat 


6.3e-24


92.9


1367 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6 


138S 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-5C 


179.5 


13 SO 


zf-C2H2


Zinc finger, C2H2 type


2.5e-85


296.9


1393 


kinesin 


Kinesin motor domain 


7.8e-188


637.4 


1354 


zf-C2H2 


Zinc finger, C2H2 type


1.2e-49


178.4 


1356 


KRAB 


KRAB box 


5.1e-22 


86.6 


14C2 


bZIP 


bZIP transcription factor 


0.035 


13.1


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5


14 00 


RhoGAP 


RhoGAP domain 


8.9e-47


168.8 


1407 


rrm 


RNA recognition motif. 


1e-35


132.1 


140£ 


LRR 


Leucine Rich Repeat 


2.1e-13


58.0 


1405 


Nebulin_repe 
at 


Nebulin repeat 


6e-54 


192.6 


141C 


ank 


Ank repeat 


1.6e-17 


71.6


1412 


Ribosomal_L5
_C


ribosomal L5P family C-terminus


8.2e-58 


205.5 


141S 


trypsin 


Trypsin 


4 .7e-85 


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-C7 


33.1 


1419 


WD40


WD domain, G-beta repeat 


2.2e-05 


44.6 


1422 


cadherin 


Cadherin domain 


8.3e-42 


152.3 


1424 


SH3 


SH3 domain 


2 .Se-80 


280.3 


142S 


PHD 


PHD-finger


3.2e-17 


70.6 


1426 


PHD 


PHD-finger


3.2e-17


70.6 


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1e-37


138.8 


1426 


helicase_C


Helicases conserved C-terminal
domain


1e-26


102.2 


1425 


WD40


WD domain, G-beta repeat


3.9e-07


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-10


40.2 


1431 


mito_carr 


Mitochondrial carrier proteins 


4.3e-83


287.7 


1433 


C1q


C1q domain


2 .Se-16 


66.2


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13


58.3 


143S 


inos-1-
P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


143 6 


rrm 


RNA recognition motif. 


1.4e-34


lift 1 
lit) . J 


143 6 




Immunoglobulin domain 


1.3e-12


45.6 


1440


G_Adapt_CT


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1443


Kelch


Kelch motif


0.00013


28.7 


1446


ARID


ARID DNA binding domain


1.8e-21


84.7 


1447


zf-C2H2


Zinc finger, C2H2 type


9.4e-28


105.6 


144 6 


AMP-binding 


AMP-binding enzyme 


2.6e-07


-145.1


1451


rrm


RNA recognition motif.


6.5e-21


82.9


1454 


ig


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21 


83.2


1460


Aldose_epim


Aldose 1-epimerase


1.9e-35


131.2 


1461 


C2


C2 domain


4e-18 


73.6 


1470 


TIG


IPT/TIG domain


3.1e-19 


77.3 


1472


PseudoU_synt
h_2


RNA pseudouridylate synthase


4.3e-16


66.9


1474 


DENN 


DENN (AEX-3) domain 


1.3e-44


161.6


1475 


Cation_efflu
x


Cation efflux family


4.6e-49


176.4


1477 


TBC 


TBC domain


8e-47


169.0 


2478 


rrm 


RNA recognition motif. 


2e-22 


84.6


1480 


ig


Immunoglobulin domain 


5 .Se-06 


24.3


1484 


Telo_bind_al
pha


Telomere-binding protein alpha
subuni


0.028


-225.9


1485 


zf-C2H2


Zinc finger, C2H2 type 


1.8e-68 


240.9 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13


49.9 


1486 


helicase_C 


Helicases conserved C-terminal
domain


1.4e-15


65.2 


1489 


DUF89 


Protein of unknown function
DUF89


0.079


-132.4


1490 


ECH 


Enoyl-CoA hydratase/isomerase 
family 


5.2e-41 


149.7


1491 


guanylate_cy 
c 


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2


1495 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10


36.3


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.6


1500 


SH3 


SH3 domain 


9.3e-05


27.2 


1502 


homeobox 


Homeobox domain


0.084


13.8


1503 


homeobox 


Homeobox domain 


0.084


13.8 


1505 


EGF 


EGF-like domain


2.7e-23


90.8 


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508 


Peptidase_M2
0


Peptidase family M20/M25/M40


2.8e-28


101.8 


1511 


PX 


PX domain 


1.9e-11


51.5


1512 


Sulfatase


Sulfatase


2. 8e-3S 


130.7 


1516 


Syntaxin 


Syntaxin 


0.011 


-62.3


2518 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075


11.0


1521 


RA 


Ras association (RalGDS/AF-6)
domain


0.013


13.3


1523 


RhoGAP 


RhoGAP domain 


2.5e-05


10.7


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24 


93.1


1535 


IMS 


impB/mucB/samB family 


7.8e-95


328.5 


1538 


FYVE 


FYVE zinc finger 


3.2e-27 


101.5 


2539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5


1540 


Ocular_alb 


Ocular albinism type 1 protein 


0 


1184.7


1653 


SAP 


SAP domain 


6e-06 


33.2


1654 


Amino_oxidas
e


Flavin containing amine oxidase


3.2e-43


157.0


2655


Amino_oxidas
e


Flavin containing amine oxidase


3.2e-43


157.0 


1656 


RhoGEF 


RhoGEF domain 


1.4e-24


95.1 


2657 


MMR_HSR1


GTPase of unknown function


0.0011


-45.5 


lob? 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1


1660 


actin


Actin


6.6e-21


C Q Q 

by . y 


1661 


BAH 


BAH domain


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93


324.4


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23


84.3


1671 


SH2 


Src homology domain 2 


5.4e-15


46.9 


167; 


chromo 


'chromo' (CHRromatin
Organization Modifier)


2.1e-18


67.7


1674 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


0.0025


17.1


1676 


Glyco_hydro_ 
47 


Glycosyl hydrolase family 47


1.8e-187


636.2


1677


Glyco_hydro_
47


Glycosyl hydrolase family 47


4.5e-74


259.5


1680


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1681 


WD40


WD domain, G-beta repeat


1.1e-27


105.5


1683 


MMR_HSR1


GTPase of unknown function


1.8e-78


274.1


16S: 


rrm 


RNA recognition motif. 


1.8e-37


137.9


1652 


rrm 


RNA recognition motif. 


1.8e-37 


137.9


1693 


AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5


1637 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


8.4e-82


285.2


1696 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


3.5e-53


190.1


169S 


zf-C2H2 


Zinc finger, C2H2 type 


4.4e-34


126.6


1700


arf 


ADP-ribosylation factor family 


9e-15 


75.6


1702 


GTP_EFTU


Elongation factor Tu family


0.014


11.1


1703 


SCAN 


SCAN domain 


1.8e-54


194.4


1707 


pkinase 


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9 


1705 


WD40


WD domain, G-beta repeat


0.0035


24.0


1710


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3


1711 


WW 


WW domain 


7.Ge-12 


52.8 


1712 


ank 


Ank repeat 


4.2e-34


126.7 


1713 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3


1715 


ras 


Ras family 


4.4e-41


149.9 


1718 


HMG_box 


HMG (high mobility group) box


8.3e-21


82.6 


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH


Helix-loop-helix DNA-binding
domain 


9.2e-10 


45.9 


1723 


dsrm


Double-stranded RNA binding
motif


2.9e-05


30.9 


1724


RrnaAD 


Ribosomal RNA adenine 
dimethylases


0.045 


9.2 


1725 


CIDE-N


CIDE-N domain


5.9e-40 


146.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44


160.5 


1728 


efhand


EF hand 


5.1e-20 


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34


126.6


1739 


PI-PLC-X 


Phosphatidylinositol-specific
phospholipase


0.0023


16.1


1743 


ras 


Ras family 


3.7e-10


-21.3


1744 


ras 


Ras family


3.7e-10


-21.3


1745 


RasGEF 


RasGEF domain 


3.2e-49


176.9 


1746 


adh_short 


short chain dehydrogenase 


7.1e-08


Id c 
.J4 . b 


1751 


zf-C2H2


Zinc finger, C2H2 type 


9e-39 




1754 


fn3 


Fibronectin type III domain 


5.5e-101




1756 


zf-C2H2


Zinc finger, C2H2 type


6.3e-93




1758 


rrm 


RNA recognition motif. 


0.017 




1760


Nop


Putative snoRNA binding domain


6.1e-95




1761 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


J- '"J 


l*ll v IK rloKJ- 


GTPase of unknown function 


D . Sc'tA 


149 . 4 


1769 


CN_hydrolase


Carbon-nitrogen hydrolase 


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07


37.1 


1779 


OxysterolBP 


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF 


RhoGEF domain 


1.6e-23 


91.6 


1784 


RhoGEF 


RhoGEF domain


1.6e-23 


91.6 


1765 


rrm 


RNA recognition motif. 




55.7



TABLE 5 



SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)






1 


1-21


0.991 


0.955 


2 


1-33 


0.995 


0.944 


3 


1-33 


0.945 


0.736 


4 


1-1* 


0.970


0.951 


5


1-26 


0.971 


0.863 


6 


1-26


0.971 


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971 


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991


0.955 


11 


1-23 


0.989 


0.899 


12 


1-25 


0.955 


0.803 


13 


1-18


0.932 


0.625 


14 


1-18


0.938


0.876 


15 


1-25 


0.942


0.811 


16 


1-17 


0.972


0.939


17 


1-27 


0.964


0.777


18 


1-16 


0.914


0.657 


19 


1-15


0.953


0.840 


20 


1-20


0.935


0.701


21 


1-22 


0.974


0.850


22 


1-33 


0.961


0.895 


23 


1-15 


0.991


0.959 


24 


1-31 


0.995


0.944


25 


1-22 


0.976 


0.935 


26 


1-2'? 


0.996 


0.928


27 


1-24


0.953


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31


0.986


0.841 


30 


1-28


0.980 


0.893 


31 


1-19 


0.993


0.976


32 


1-22


0.998


0.909


35 


1-33 


0.949


0.736


36 


1-33 


0.949


0.736


46 


1-19 


0.570


0.951


67 


1-25 


0.968 


0.846 


71 


1-18 


0.949


0.845


72 


1-30 


0.991 


0.919


75 


1-29 


0.958


0.854


88 


1-20 


0.986


0.945 


94 


1-33 


0.994


0.943


97 


1-46 


0.964


0.595 


103 


1-49 


0.983 


0.570


108 


1-26 


0.978 


0.885 


111 


1-23 


0.969 


0.899 


126 


1-25 


0.955 


0.803 


129 


1-1S 


0.962 


0.918 


138 


1-29 


0.971 


0.844 


143 


1-18 


0.914 


0.628 


148 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811 


158 


1-22 


0.979 


0.927 


160 


1-17 


0.972 


0.939 


161 


1-46 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.94S 


0.825 


180 


1-27 


0.981


0.941 


187 


1-26 


0.982 


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.97* 


0.916 


197 


1-22 


0.963


0.936 


195 


1-20


0.935


0.701 


200 


1-23 


0.&77 


0.773


206 


1-30 


0.564 


0.890 


207 


1-19 


C . 99C 


0.924 


20£ 


1-22 


0.974 


0.850 


210 


1-40 


0.940 


0.670


211 


1-28 


0.57: 


0.849


216 


1-24 


0.986 


0.956


21b 


1-33 


0.5*6} 


0.895 


219 


1-19 


0.S7C 


0.871 


221 


1-19 


0.504


0.553


222 


1-21 


0.917 


0.555


230 


1-19 


0.993


0.959


231 


1-26 


0.952


0.800


232 


1-25 


C .586 


0.826


23$ 


1-23 


0.969


0.628


240 


1-17 


0.987 


0.955 


241 


1-1? 


0.587 


0.955 


245 


1-30 


C.57C 


0.722


246 


1-22


0.976 


0.935 


249 


1-23


0.56J- 


0.940


252 


1-18 


0.971 


0.923 


261 


1-24 


0.68 J 


0.587 


265 


1-18 


0.935


0.868


272 


1-24 


0.952


0.739


283 


1-21 


0.906


0.688


284 


1-29 


0.55'/ 


0.854


290 


1-31 


0.986


0.841


302 


1-28 


0.980


0.893


304 


1-16 


0 . 90~ 


0.635 


312 


1-19 


0.953


0.976


313 


1-17 


0.930


0.753


323 


1-22 


Q . 59£ 


0.909


324 


1-17 


0.587


0.954


328 


1-19 


0.571


0.665 


329 


1-22 


0.S62 


0.924 


330 


1-33 


0.576


0.641


331 


1-24 


0. S2C 


0.712


332 


1-24 


0.575


0.881


333 


1-19


0.984


0.941


334 


1-20 


0. €-55 


0.561 


335 


1-27 


0.542


0.813


336 


1-20 


0 . 


0.850


337 


1-38 


0.542


0.653


336 


1-27 


0.573


0.772 


339 


1-36 


0.575


0.804 


340 


1-27 


o.ee? 


0.597


343 


1-19 


0.5";: 


0.865


344 


1-22 


0 . 95*. 


0.928 


345 


1-17


0.9~ee 


0.687 


346 


1-19 


0.53C 


0.822 


347 


1-22 


o.se:- 


0.924 


349 


1-24 


0.962 


0.966 


351 


1-21 


0.91fc 


0.815 


352 


1-31 


0.96t 


0.912


254 


1-31 


0.974 


0.839 


355


1-29 


0.932 


0.632


356 


1-15 


0.954 


0.969 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.93f 


0.827


361 


1-25 


0.954 


0.674 


362 


1-22 


0.925 


0.788 


363 


1-21 


o.ee: 


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33 


0.976 


0.841 


36b 


1-21 \ 0 . 9 1 t 


0.820 


36'/ 


1-15


0.936


0.822 


36fc 


1-25 


0.97; 


0.874 


37C 


1-24 


0.92C 


0.712 


37j 


1-24 


0.963 


0.773 


3 72 


1-27 


0.915 


0.768 


373 


1-15


0.986


0.945 


37b 


1-32 


0.994 


0.932 


3 76 


1-34 


0.987 


0.81C 


37'/ 


1-17 


0.99b 


0.950 


378 


1-49 


0.973 


0.745 


38C 


1-20 


0.368 


0.874 


381 


1-2C 


0.926


0.782


382 


1-19 


0.986 


0.934 


383 


1-26 


0.965 


0.829 


384 


1-39 


0 . 97C 


0.551


386 


1-24 


0.575 


0.881


386 


1-30 


0.985


0.866


385 


1-19 


0.984


0.941


39C 


1-26 


0.973


0.782 


392 


1-20 


0.983


0.900


393 


l-Z-6 


0.966


0.890 


354 


1-23 


0.937


0.703 


397 


1-22 


0.985 


0.854 


39$ 


1-46 


0.977


0.69& 


401 


1-20 


0.895


0.567


402 


1-22 


0.967


0.931


403 


1-27 


0.992


0.934 


404 


1-19 


0.991


0.973 


40b 


1-23 


0.994


0.921 


407 


1-3S 


0.987


0.656 


408 


1-35 


0.976


0.551 


40$ 


1-33 


0.897


0.570 


410 


1-25 


0.99C 


0.962


411 


1-36 


0.977 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988 


0.565 


414 


1-46 


0.993 


0.638 


415 


1-23 


0.98: 


0.940 


417 


1-29 


0.941 


0.672 


418 


1-20 


0.952 


o.esc 


419 


1-19 


0.986 


0.967 


420 


1-29 


0.96S 


0.861 


421 


1-22 


0.885 


0.785 


422 


1-46 


0.982 


0.862 


424 


1-19 


0.979 


0.933 


42£ 


1-38 


0.542 


0.653 


430 


1-18 


0.947 


0.595 


432 


1-33 


0.S57 


0.789


433 


1-26 


0.S79 


0.904 


434 


1-27 


0.962 


0.777 


435 


1-24 


0.996 


0.577 


436 


1-27 


0.973 


0.772 


443 


1-15 


0.966 


0.940


448 


1-36 


0.275 


0.804 


453 


1-41 


0.9S8 


0.609 


455 


1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597 


462 


1-16 


0.525 


0.681 


486 


1-27 


0.972 


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.566 


0.687 


510 


1-23 


0.930 


0.S93 


511 


1-21* 


n Qif> 
u . y.} u 




C1 - 
jI/ 


l-z: 


0 °3 0 


0 59 "* 


C 1 £ 
31. 


X - J. o 


0 978 


0 . 956 


- 9 -. 


1-15 


0.936


0.822


D - 


J. - C 4. 


0 963 


0 924 


-3S _ 


J. - J. H 




0 96£ 


DDL" 


J- - J Li 


0 933 


0 713 


552 


1- 2 


0 . 973 


0 912 


3D. 




v . 3D 3 


0 7 B c 


CT| 

O / J 


X — £. 1 


0 °1 6 


0 8 1 L 


cn/ 
j / ** 


1 - 3 ~ 


ft. Qfifl 


0 . 912 


cor, 


1 .to 
1"J3 


ft co^l 


0 556 






0 c 7 4 


0 835 


606 


1-2 5 


ft oi •) 


0 63* 


bUr 


1-29 


ft cm 


u . b .? ^ 


biv 




0 . 990 


0.946 


621 


1-15 


0.594 


0.969 


623 


1-33 


0.935


0.726


653 


1-27 


0.938


0.827


666 


1-22


0.929


0.786 


67"; 


1-16 


0.948


0.807 


68b 


1-23 


0.861 


0.71b 


695 


1-22 


0.975


0.816 


702 


1-31 


0.968


0.898


707 


1-16 


0.850


0.561


713 


1-25 


0.966


0.743 


718 


1-19 


0.936


0.822


715 


1-20 


0.961


0.824


725 


1-29 


0.972


0 . 87*4 


736 


1-46 


0.903


0.596


74 e 


1-3 (> 


0.916 


0.730


74 7 


1-22 


0.965


0.876


746 


1-25 


0.966


0.765


755 


1-24


0.961


0.772


767 


1-27 


0.919


0.768 


768 


1-33 


0.900


0.585 


773 


1-42 


0.959


0.702 


779 


1-19 


0 . 986 


0 . 94£ 


797 


1-19 


0.944


0.759


798 


I- IS' 


0.900


0.566


820 


1-17 


0 . 995 


0 . 95C 


827 


1-49 


0.971


0.745


84 8 


1-20 


0.966


0.874


864 


1-2C 


0 . 926 


0 . 785' 


Q C £ 
ODD 


1-19 


0 . 986 








ft Q/ C 




API 

vol 


1*4C 




0 825 


a p "7 


1 -5 - 


0 . 970 


U . ->->-! 


J* / 


J. J U 


ft QflC 


ft ttcp 


934 


1-48 "~ 


0 988 


0 . 777 


93 9 


1-3 c 




0 885 


944 


1-26 


n q-ji 

U . 7 /l 


0 782 


"950 


1-29 


0 . 957 


0 845 


963 


1-2C 


0 . 981 


0 . 900 


964 


1-2C 


0.886 


0 558 


973 


1-16 


n qcp 


0 . 890 


980 


1-34 


n qci 

U . ?Dl 


ft 1A 0 


981 


1-20 




0 . 822 


984 


1-12 


ft qi p 


ft 7flO 


1015 


1-22 


0 . 985 


0 . 854 


1040 


1-46 


0.977 


0.696 


1052 


i-ie 


0.969 


0.842 


1059 


1-20 


0.927 


0.867 


1065 


1-33 


0.983 


0.916 


1069 


1-22 


0.993 


0.935 


1075 


1-27 


0.992


0.934 


1080 


1-15 


0.931


0.829


1092 


1-15 


0.991 


0.973 


1094 


:-4t 


0.992 


0.653 


1095 


1-30 


0.974 


0.929 


1105 


1-23 


0.994 


0.921


1123 


1-35 


0.987


0.658


1136 


1-32 


0.954 


0.613


1140 


1-36 


0.989 


0.789


1142 


1-33 


0.897 


0.570 


1152 


1-2S 


0.990 


0.962 


1170 


1-36 


0.977 


0.827 


1176 


1-20 


0.944 


0.768


1187 


1-20 


0.988 


0.965 


1189 


1-35 


0.967 


0.639


1152 


1-4& 


0.993 


0.636 


1193 


1-lfc 


0.925 


0.710


1137 


1-2° 


0.985 


0.653 


1208 


1-23 


0.581 


0.940 


1225 


1-2* 


0.941 


0.672 


1245 


1-15 


0 .566 


0.967 


125e 


1-25 


0 .565 


0.861 


1265 


1-22 


0 .889 


0.785 


1266 


1-2C 


0 .944 


0.809 


1276 


l-4f 


0.982


0.862


1292 


1-19 


0.979


0.933


1296 


1-21 


0.984


0.944


1297 


1-15 


0.984 


0.953


1332 


1-3F 


0.942


0.653


1358 


1-18


0.947


0.595


1371


1-33 


0.957


0.789


1380 


l-2b 


0.979 


0.904


1397 


1-27 


0.962 


0.777 


1399 


1-23 


0.997 


0.960 


1404 


2-24 


0.998 


0.977 


1410 


1-15 


0.946 


0.845 


1414 


1-24 


0.913 


0.588 


1415 


1-19 


0.982 


0.925 


1416 


1-12 


0.931 


0.891 


1418 


1-30 


0.933 


0.563 


1420 


l-2(. 


0.881


0.561 


1421 


1-15 


0.990 


0.968 


1423 


1-17 


0.968 


0.863 


1424 


1-21 


0.885 


0.591 


1425 


1-24 


0.913 


0.588 


1426 


1-24 


0.913 


0.588 


1428 


1-25 


0.557 


0.895 


1430 


1-34 


0.977 


0.819 


1431 


1-26 


0.979 


0.923 


1432 


1-36 


0.957 


0.613 


1433 


1-32 


0.921 


0.753 


1434 


1-35 


0.983


0.621 


1435 


1-25 


0.910 


0.631 


1436 


1-42 


0.988 


0.868 


1437 


1-22 


0.998 


0.980 


1442 


1-20 


0.916 


0.7S3 


1448 


1-12 


0.931 


0.891 


1462 


1-18 


0.968 


0.888 


1490 


1-2C 


0.881 


0.561 


1518 


1-17 


0.968 


0.863 


1525 


1-21 


0.885 


0.591 


1547 


1-2B 


0.974 


0.891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0.923 


0.824


1593 


1-26 


0.979 


0.923


1596 


1-16 


0.929


0.709


1601 


1-36 


0.957


0.613 


1606 


1-2; 


0.975


0.831


1607 


1-20 


0 . 97s 


0.770 


1606 


1-32 


0.921


0.753


1614 


1-33 


0.969 


0.829 


1616 


1-20 


0.955 


0.869 


162S 


1-35 


0.983 


0.621


1632 


1-25 


0.910 


0.631


1636 


1-33 


0.697 


0.591


163S 


1-42 


0.986 


0.868 


1645 


1-20 


0.S27 


0.568 


1647 


1-17 


0.523 


0.742 


1646 


1-22 


0.998 


0.980 



TABLE 6 



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725




1 


1787 


3573 


5359


784CIP2_1


11 0,* 


2 


1788 


3574 


5360


784CIP2_2


2671* 




1789 


3575 


5361 


784CIP2 3 


4117 


4


1790 


3576 


5362 


784CIP2 4 


555( 




1791


3577 


5363 


784CIP2 5 


5562 




1792 


3578 


5364 


784CIP2_6


5562 




1793


3579


5365




5562 


8


1794


3580


5366 




5562 


9


1795 


3581 


5367 


784CIP2_9


55 6 


10


1796


3582


5368


784CIP2_10


c c <r t 
ODDS 


11 


1797 


3583 


5369 


784CIP2_11


556^ 


12 


1798 


3584 




784CIP2_12


568S 


13


1799 


3585 


5371 


784CIP2_13


5725 


14 


1800


3586


5372




574i. 


15


1801 


3587 


5373


784CIP2_15


577' 


16 


1802


3588 


5374 


784CIP2_16 


5777 


17 


1803 


3589 


5375


784CIP2_17


5789


18 


1804 


3590 


5376 


784CIP2_18


5792 


19 


1805 


3591 


5377 


784CIP2_19


5804 


20 


1806


3592


5378


784CIP2_20


5805 




21 


1807 


3593 


5379


784CIP2 21 


580£ 


22 


1808 


3594 


5380 


784CIP2 22 


5844 


23 


1809


3595


5381


784CIP2_23 


5844 


24 


1810


3596 


5382 


784CIP2_24 


5850 


25 


1811


3597


5383


784CIP2_25 


5867 


26 


1812 


3598 


5384 


784CIP2 26 


5972 


27 


1813 


3599 


5385 


784CIP2_27


5995 


28 


1814 


3600 


5386


784CIP2_28 


5995 


29 


1815


3601


5387


784CIP2_29


6005 


30 


1816 


3602 


5388


784CIP2_30


6007 


31 


1817


3603


5389


784CIP2_31


6007


32 


1818


3604


5390


784CIP2_32


6009 


33 


1819


3605


5391


784CIP2_33


6022 


34 


1820 


3606 


5392


784CIP2 34 


6015 


35


1821


3607


5393


784CIP2_35


6026 


36


1822


3608


5394




6016 


37


1823


3609


5395


784CIP2_37


6018 


38


1824


3610


5396


784CIP2_38


6018 


39


1825


3611


5397




6018 


40 


1826 


3612 


5398


784CIP2_40


6023 


41 


1827 


3613 


5399




6070 


42 


1828


3614


5400


784CIP2_42


608: 


43 


1829 


3615 


5401


784CIP2 43 


6089 


44 


1830


3616 


5402 


784CIP2 44 


6llt 


45 


1831


3617


5403


784CIP2 45 


6116 


46 


1832 


3618 


5404


784CIP2 46 


613G 


47 


1833 


3619 


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406 


784CIP2 48 


6189 


49 


1835 


3621 


5407 


784CIP2_49


6191 


50 


1836 


3622 


5408


784CIP2_50


6204 


51 


1837 


3623 


5409


784CIP2_51


6204 


52 


1838


3624


5410


784CIP2_52 


6284 


53 


1839 


3625 


5411


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2_54 


6436 


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2_58


6458 


59 


1845 


3631 


5417 


784CIP2_59 


6458 


60 


1846 


3632 


5416 


784CIP2 60 


6462 


61 


1847 


3633 


5419 


784CIP2 61 


6472 


62 


1846 


3634 


542C 


764CIP2 62 


6495 


63 


1S49 


363£ 


54 21 


784CXP2 6* 


6495 


64 


1850 


3636 


5422 


784CIP2_64 


6505 


65 


1051 


3637 


5423 


784C1P2 6f: 


6534 


66 


1852 


3638 


5424 


784CIP2 6f 


6534 


67 


1853 


3635 


5425 


784CIP2 6; 


6540 


66 


1854 


3640 


5426 


784CIP2 6b 


6550 


69 


1855 


3641 


5427 


784C1P2 65 


6550 


70 


1856 


3642 


5428 


784C1P2 7C 


6592 


71 


1857 


3643 


5429 


784CIP2 73 


6645 


75 


1958 


3644 


5430 


784CJP2 72 


6671 


73 


1359 


3645 


5431 


784CIP2 73 


6763 


74 


1860 


3646 


54 32 


7 8.4 CI PV 74 


6763 


75 


1361 


3647 


5433 


7B4C1P2 7 L 


6786 


76 


1862 


364 6 


54 3 4 


784CIP? 7f> 


6824 


77 


1663 


3645 


543 5 


1 <5*i V» J r £. 1 1 


6830 


78 


1864 


3650 


54 36 


1 OILiri /D 


6831 


7 c 


186S 


3653 


54 3 7 




6 832 


80 


1866 


" 3652" """" 


543 8 




6834 


81 


1867 


36S3 


54 3 9 


7fi4fTP , 7 fl" 


6834 


82 


1868 


3654 


54 4 0 




683 b 


83 


285£ 




5441 




6837 


84 


1870 


3 656 


544 2 


7A4CTP7 04 


6843 


85 


lo / J. 




CA A 1 


TDiPTDO fit. 




86 


1872 


J DDO 


54 4 ^ 


/ 0 *i I- -L r ot 


CQ1 C 
O ? J- - 


87 


18 73 


3655 


c:44 c. 


"7fi/ir'7t>"7 P7 


693 2 


8 8 


1874 


3 660 


544 6 


/Ofi^J.x'^ DC 


6957 


8 9 




J DO J 








90 


1876 


3 662 


544 8 




6973 


91 


1877 


J DO J 




TfiAPlPO Q7 


6973 


92 


18 78 


3 664 


54 50 


t O V JL if 3 


700 7 


93 


1879 


3 665 


54 51 




7018 


94 


1880 


3 666 






7019 


95 


1881 


3 667 


54 5"* 


7H4CTP"? Qf> 


7020 


96 


18 82 


3668 


5454 




7020 


97 


18 83 


3669 


54 55 


1 04^1rZ -* C 


7021 


9tj 


1884 


3 670 


5456 




7023 


95 


1885 


3671 


5457 


784CIP9 • lOf) 


7027 


100 


1686 


3672 


5458 


7B.4CIP? 103 


7028 


101 


1887 


3673 


5459 




7029 


102 


1888 


3674 


54 60 




7031 


103 


1889 


3675 


5461 


784CIP2 104 


7032 


104 


1890 


3676 


5462 


784CIP2 105 


7033 


105 


1891 


3677 


5463 


784CIP2 106 


7035 


106 


1892 


3678 


5464 


784C1P2 107 


7036 


107 


1893 


3675 


5465 


784CIP2 106 


7039 


108 


1894 


3680 


5466 


784CIP2 102 


7043 


109 


ie95 


3 681 


5467 


784CIP2 110 


7044 


110 


1896 


3682 


5468 


784CIP2 111 


7046 


111 


1897 


3683 


5469 


784CIP2_112 


7054 


115 


1898 


3684 i 


54 70 


784CIP2 113 


7061 


112 


1899 


3685 


5471 


784CIP2 114 


7077 


114 


1900 


3686 


5472 


784CIP2 115 


7092 


115 


1901 


3687 


5473 


784C1P2 116 


7094 


116 


1902 


3686 


54 74 


784CIP2_117 


7106 


117 


1903 


3689 


5475 


784C1P2_118 


7107 


118 


1904 


3690 


5476 


784CIP2_119 


7111 


119 


1905 


3691 


5477 


784CIP2 120 


7123 


120 


1906 


3692 


5478 


7B4CIP2121 


7142 


121 


1907 


3693 


5479 


784CIP2_122 


7142 


122


1908 


3694 


54 8 C 


784CIP2_l23 t 


7154 


223 


190S 


3695 


5481 


784CIP2_124 


7160 


124 


1910 


3696 


54 82 


784CIP2_125 


7169 


125 


1911 


3697 


5483 


784CIP2 126 


7185 


126 


1912 


369E 


54 84 


784CIP2 127 


7197 


127 


1913 


3695 


54 85 


704CIP2_128 


7219 


126 


1914 


3 7CC 


5486 


784CIP2_i29 


7226 


125 


1915 


3701 


5487 


784CIP2 130 


7229 


13G 


1916 


3702 


5488 


784CIP2 131 


7234 


131 


1917 


3703 


5489 


784CIP2_132 


7235 


132 


1918 


3704 


5490 


784CIP2_133 


7235 


333 


1919 


3705 


5491 


784CIP2_234 


7238 


134 


1920 


3706 


5492 


784CIP2_13S 


7247 


135 


1921 


3707 


5493 


784CIP2_ 236 


7261 


13fc 


1922 


370£ 


5494 


784CIP2 137 


7262 


127 


1923 


3705 


5495 


784CIP2_138 


7267 


136 


1924 


3710 


5496 


784CIP2_139 


7272 


139 


1925 


3711 


5497 


784CIP2_140 


7273 


140 


1926 


3712 


5498 . 


784CIP2_141 


7282 


141 


1927 


3713 


5499 


784CIP2_142 


7288 


142 


1928 


3714 


5500 


784CJP2_143 


7291 


143 


1929 


3715 


5501 


784CIP2144 


7293 


144 


1930 


3716 


5502 


784CIP2_145 


7294 


145 


1931 


3717 


5503 


764CIP2_246 


7299 


246 


1932 


3728 


5504 


784CIP2_14 7 


7300 


147 


1933 


3719 


5505 


784CIP2_146 


7312 


346 


1934 


372C 


5506 


784CIP2 149 


7313 


145 


1935 


3721 


5507 


784CIP2150 


7315 


ISO 


1936 


3722 


55C8 


784CIP2_15i 


7318 


153 


1937 


3723 


55C9 


784CIP2^152 


7321 


152 


1938 


3724 


5510 


784CIP2153 


7330 


153 


1939 


3725 


5511 


784CIP2_154 


7331 


154 


1940 


3726 


5512 


784CIP2_155 


7333 


155 


1941 


3727 


5513 


784CIP2_1S6 


7350 


ISfc 


1942 


3728 


5514 


784CIP2 157 


7352 


157 


1943 


3729 


5515 


7Q4CIP2_156 


7384 


ISfc 


1944 


3730 


5516 


784CIP2_15? 


7403 


155 


1945 


3732 


5517 


784CIP2_16L' 


7431 


260 


, 1946 


3732 


5516 


784CIP2_161 


7441 


161 


1947 


3733 


5519 


784CIP2_162 


7453 


162 


1948 


3734 


552 C 


784CIP2_163 


7467 


163 


194S 


3735 


5521 


784CIP2_3fe4 


7471 


164 


1950 


3736 


5522 


784CIP2 165 


7493 


165 


1951 


3737 


5523 


784CIP2_166 


' 7502 


16t 


1952 


3736 


5524 


784CIP2167 


7511 


167 


1953 


3735 


5525 


784CIP2_166 


7514 


iee 


1954 


374C 


5526 


784CIP2_16S 


7520 


165 


1955 


3741 


5527 


784CIP2 170 


7541 


170 


1956 


3742 


5528 


784CIP2_171 


7570 


171 


1957 


3743 


5529 


784CIP2_172 


7578 


172 


1958 


3744 


5530 


784CIP2 172 


7583 


173 


1959 


3745 


5531 


784CIP2__174 


7592 


174 


1960 


3746 


5532 


784CIP2_175 


7601 


175 


1962 


3747 


5533 


/ohLIFz 1 fc 


/ O \JZ. 


176 


1962 


3748 


5534 


7 84 CI P2_l 7 7 


7608 


177 


1963 


3749 


5535 


784CIP2__178 


7615 


176 


1964 


3750 


5536 


784C1P2_179 


7617 


179 


1565 


3751 


5537 


784CIP2_181 


7624 


180 


1966 


3752 


5538 


784CIP2 182 


7626 


161 


1367 


3753 


5539 


784CIP2 183 


7640 


182 


1968 


3754 


5540 


7B4CIP2 184 


7641 


183 


1969 


3755 


5541 


784CIP2_18S 


7641 


184 


1970 


37 56 


5542 


7 84CIP2_186 


7641 


185 


1971 


3757 


5543 


784CIP2_187 


7642 


286 


1972 


3756 


5544 


784CIP2_ 188 


7649 


187 


1973 


3759 


5545 


7 84CIP2_1 89 


7656 


188 


1974 


3760 


5546 


784CIP219D 


7657 


ICS 


1975 


376 1 


£547 


784CIP2_191 


7657 


19C 


157$ 


3762 


554 8 


784CIP2_192 


7662 


191 


1977 


37 63 


5549 


784CIP2_193 


7668 


192 


1978 


3764 


5550 


784CIP2_194 


7673 


193 


1979 


316b 


5551 


784CIP2_195 


7690 


194 


1980 


3766 


5552 


784CIP2_196 


7700 


195 


1981 


3767 


5553 


784CIP2_197 


7709 


196 


1982 


3768 


5554 


784CIP2_198 , 


7736 


197 


1583 


376S 


5555 


784CIP2_199 


7737 


198 


1984 


377C 


5556 


784CIP2_200 


7744 


199 


198S 


3773 


5557 


784CIP2 201 


7771 


200 


1986 


3 772 


5558 


784CIP2 202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


1989 


3771, 


5561 


784CIP2_205 


7806 


204 


1990 


3776 


5562 


784CIP2_206 


7812 


205 


1991 


377 7 


5563 


784CIP2_207 


7812 


2C6 


1992 


3776 


5564 


784CIP2_206 


7818 


207 


1993 


377S 


5565 


784CIP2_205 


7822 


208 


1994 


3780 


5566 


784CIP2_210 


7827 


209 


1995 


3782 


5567 


784CIP2J211 


7830 


210 


1996 


376> 


5568 


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840 


212 


1998 


3784 


5570 


784CIP2_215 


7858 


213 


1999 


378t 


5571 


784CIP2 216 


7856 


214 


2000 


376C 


5572 


784CIP2_217 


7861 


215 


2001 


3767 


5573 


784CIP2_218 


7866 


216 


2002 


3786 


5574 


784CIP2_219 


7868 


217 


2003 


3 785 


5575 


784CIP2_220 


7896 


216 


2004 


379C 


5576 


784CIP2_221 


7898 


219 


2005 


3751 


5577 


784CIP2_222 


7900 


220 


2006 


3752 


5578 


784CIP2_223 


7906 


223 


2007 


379Z- 


5579 


784CIP2_224 


7908 


222 


2008 


3754 


5580 


764CIP2_225 


7909 


223 


2009 


3755 


5581 


784CIP2_226 


7917 


224 


2010 


3756 


5582 


784CIP2_227 


7932 


225 


2011 


. 3797 


5583 


784CIP2_228 


7940 


226 


2012 


379P 


5584 


784CIP2_225 


7940 


227 


2023 


3755 


5585 


784CIP2_230 


7984 


228 


2014 


3800 


5586 


784CIP2_231 


7984 


229 


2015 


3 8C1 


5587 


784CIP2232 


8001 


230 


2016 


3802 


5588 


784CIP2233 


8021 


231 


2017 


3803 


5589 


784CIP2_ 234 


8029 


232 


2018 


3804 


559C 


7B4CIP2__235 


8033 


233 


2019 


38CL- 


5591 


784CIP2_236 


8040 ; 


234 


2020 


3806 


5S52 


784CIP2_237 


8052 


235 


2021 


3807 


5593 


784CIP2 238 


8096 


236 


2022 


3808 


5594 


784C1P2_239 


8096 


237 


2023 


3805 


5595 


784CIP2_240 


8113 


238 


2024 


3810 


5596 


784CIP2 241 


8126 


535 


2025 


3811 


5597 


784CIP2_242 


8132 


24 0 


2026 


3812 


5598 


784CIP2_243 


8137 


241 


2027 


3813 


5599 


784CIP2_244


8137 


242 


2028 


3814 


5600 


784CIP2_245


8159 


243 


2029 


3815 


5601


784CIP2_246


8159 


244 


2030 


3816 


5602 


784CIP2_247 


8161 


245 


2031 


3817 


5603 


784CIP2_248


8176


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




246


2032 


3818


5604 


784CIP2_249 


8196


247 


2033


3819 


5605 


784CIP2_250


8200


248 


2034 


3820 


5606 


784CIP2_251


8212


249 


2035


3821


5607 


784CIP2_252 


8220


250


2036 


3822 


5608 


784CIP2_253


8236


251


2037 


3823


5609 


784CIP2_254


8254 


252 


2038


3824 


5610 


784CIP2_255


8255


253 


2039 


3825 


5611 


784CIP2_256 


8288 


254 


2040 


3826 


5612 


784CIP2_257 


8296


255


2041 


3827 


5613 


784CIP2_258 


8329 


256


2042 


3828 


5614 


784CIP2_259


8362


257


2043


3829


5615 


784CIP2_260 


8425


258 


2044 


3830 


5616 


784CIP2_261 


8436


259 


2045


3831 


5617 


784CIP2_262


8448


260 


2046


3832


5618 


784CIP2_263


8472


261 


2047 


3833 


5619 


784CIP2_264


8502


262 


2048


3834 


5620 


784CIP2_265 


8504


263 


2049


3835 


5621 


784CIP2_266 


8507


264


2050 


3836 


5622 


784CIP2_268


8509 


265 


2051 


3837 


5623 


784CIP2_269 


8515 


266 


2052


3838


5624 


784CIP2_270 


8519 


267 


2053 


3839


5625 


784CIP2_271 


8530 


268 


2054 


3840


5626 


784CIP2_272


8532 


269 


2055 


3841 


5627 


784CIP2_273 


8532


270 


2056 


3842 


5628 


784CIP2_274


8539 


271 


2057 


3843 


5629 


784CIP2_275 


8541 


272 


2058


3844 


5630 


784CIP2_276


8543


273 


2059


3845


5631 


784CIP2_277


8593 


274 


2060


3846 


5632 


784CIP2_278


8595 


275 


2061


3847 


5633 


784CIP2_279


8635 


276 


2062 


3848 


5634 


784CIP2_280 


8620 


277 


2063 


3849 


5635 


784CIP2_281 


8621 


278 


2064 


3850 


5636 


784CIP2_282 


8623 


279 


2065 


3851 


5637 


784CIP2_283 


8625 


280 


2066 


3852 


5638 


784CIP2_284 


8628 


281 


2067 


3853 


5639 


784CIP2_285 


8628 


282 


2068 


3854 


5640 


784CIP2_286 


8629 


283 


2069 


3855 


5641 


784CIP2_287


8630


284 


2070 


3856 


5642 


784CIP2_288 


8621 


285 


2071 


3857 


5643 


784CIP2_289 


8633 


286 


2072 


3858 


5644


784CIP2_290 


8634 


287 


2073 


3859 


5645 


784CIP2_291 


8625 


288 


2074 


3860 


5646 


784CIP2_292 


8636 


289 


2075 


3861


5647 


784CIP2_293


8659 


290 


2076 


3862


5648 


784CIP2_294 


8660 


291 


2077 


3863 


5649 


784CIP2_295 


8667


292 


2078


3864 


5650 


784CIP2_296


8667 


293 


2079 


3865 


5651 


784CIP2_297 


8685 


294 


2080 


3866 


5652 


784CIP2_298


8805


295 


2081


3867 


5653 


784CIP2_299 


8896 


296 


2082


3868 


5654 


784CIP2_300 


8976 


297 


2083


3869 


5655 


784CIP2_301


9046 


298 


2084


3870 


5656 


784CIP2_302 


9048


299 


2085 


3871 


5657 


784CIP2_303 


9116 


300 


2086 


3872 


5658 


784CIP2_304 


9195 


301 


2087


3873 


5659 


784CIP2_305 


9201 


302 


2088


3874 


5660 


784CIP2_306 


9307 


303 


2089 


3875 


5661 


784CIP2_307


9321 


304


2090 


3876 


5662 


784CIP2_308


9397 


305


2091 


3877


5663 


784CIP2_309 


9405 


306 


2092 


3878 


5664 


784CIP2_310


9406 


307


2093 


3879 


5665 


784CIP2_311


9422


308


2094 


3880 


5666 


784CIP2_312 


9494 


309 


2095


3881


5667


784CIP2_313 


9512 


310 


2096 


3882 


5668


784CIP2_314


9632 


311 


2097 


3883 


5669


784CIP2_315 


966: ( 


312


2098


3884 


5670


784CIP2_316 


9664


313


2099 


3885


5671 


784CIP2_317


9693


314 


2100 


3886 


5672


784CIP2_318


9700


315 


2101 


3887 


5673 


784CIP2_319


971L 


316 


2102 


3888 


5674 


784CIP2_320 


972S ' 


317 


2103 


3889 


5675


784CIP2_321


S870 


318 


2104 


3890 


5676


784CIP2_322 


9887


319 


2105 


3891 


5677 


784CIP2_323 


9923


320 


2106 


3892 


5678 


784CIP2_324 


9938 


321 


2107 


3893 


5679


784CIP2_325


9964 


322 


2108


3894 


5680 


784CIP2_326 


10007 


323 


2109 


3895 


5681


784CIP2_327


10009 


324 


2110 


3896 


5682


784CIP2_328


10046


325 


2111


3897 


5683


784CIP2_329


10156 


326 


2112 


3898 


5684 


784CIP2_330


10276


327 


2113 


3899 


5685


784CIP2_331


10283


328 


2114 


3900 


5686


784CIP2B_1 


152 


329 


2115 


3901 


5687 


784CIP2B_2 


167 


330 


2116 


3902 


5688


784CIP2B_3 


205 


331 


2117 


3903 


5689


784CIP2B_4 


210 


332 


2118


3904 


5690 


784CIP2B_5 


225 


333 


2119 


3905 


5691


784CIP2B_6 


226 


334 


2120 


3906 


5692 


784CIP2B_7


264 


335 


2121 


3907 


5693 


784CIP2B_8


266 


336 


2122 


3908 


5694 


784CIP2B_9 


293 


337 


2123 


3909 


5695


784CIP2B_10


293 


338 


2124


3910 


5696 


784CIP2B_11 


295 


339 


2125 


3911 


5697 


784CIP2B_12 


302 


340 


2126 


3912 


5698 


784CIP2B_13 


313 


341 


2127 


3913 


5699


784CIP2B_14


352 


342 


2128


3914 


5700


784CIP2B_15


356 


343 


2129 


3915 


5701


784CIP2B_16 


368 


344 


2130 


3916 


5702 


784CIP2B_17 


393 


345 


2131 


3917 


5703 


784CIP2B_18


477 


346 


2132 


3918 


5704 


784CIP2B_19 


508


347 


2133 


3919 


5705 


784CIP2B_20


506 


348 


2134 


3920 


5706 


784CIP2B_21


515 


349 


2135 


3921 


5707 


784CIP2B_22 


578 


350 


2136 


3922 


5708


784CIP2B_23 


586 


351 


2137 


3923 


5709 


784CIP2B_24 


592 


352 


2138 


3924 


5710 


784CIP2B_25 


593 


353 


2139 


3925 


5711 


784CIP2B_26 


594 


354 


2140 


3926 


5712 


784CIP2B_27 


619 


355 


2141 


3927 


5713 


784CIP2B_28


620 


356 


2142 


3928 


5714 


784CIP2B_29 


654 


357 


2143 


3929 


5715 


784CIP2B_30 


692 


358 


2144 


3930 


5716 


784CIP2B_31 


753 


359


2145 


3931 


5717 


784CIP2B_32 


758 


360 


2146


3932 


5718 


784CIP2B_33


787 


361 


2147 


3933 


5719 


784CIP2B_34


833 


362 


2148 


3934 


5720 


784CIP2B_35


638 


363 


2149 


3935 


5721


784CIP2B_36


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151


3937 


5723 


784CIP2B_38


891 


366 


2152 


3938 


5724 


784CIP2B_39


922 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B_41


932 


369 


2155 


3941 


5727 


784CIP2B_42


942


370 


2156


3942 


5728


784CIP2B_43 


956 


371 


2157


3943 


5729 


784CIP2B_44 


968 


372 


2158


3944 


5730 


784CIP2B_45


992 


373 


2159 


3945


5731 


784CIP2B_46 


1025 


374 


2160


3946 


5732 


784CIP2B_47


1074 


375 


2161


3947 


5733 


784CIP2B_48 


1104 


376 


2162 


3948 


5734 


784CIP2B_49 


1114 


377 


2163 


3949 


5735 


784CIP2B_50


1144 


378 


2164 


3950 


5736 


784CIP2B_51 


1262 


379 


2165 


3951 


5737 


784CIP2B_52 


1318 


380 


2166


3952 


5738


784CIP2B_53 


1315 


381 


2167 


3953 


5739 


784CIP2B_54 


1326 


382 


2168


3954 


5740 


784CIP2B_55


1436 


383 


2169


3955


5741


784CIP2B_56


1464 


384 


2170 


3956 


5742 


784CIP2B_57 


1584 


385 


2171


3957 


5743


784CIP2B_58 


1617 


386 


2172 


3958


5744 


784CIP2B_59


1724


387 


2173 


3959 


5745 


784CIP2B_60 


1728 


388 


2174 


3960 


5746 


784CIP2B_61 


1772 


389 


2175 


3961 


5747 


784CIP2B_62 


180S 


390 


2176 


3962 


5748 


784CIP2B_63


1868 


391 


2177 


3963 


5749 


784CIP2B_64


1896 


392 


2178


3964 


5750 


784CIP2B_65 


1926 


393 


2179 


3965 


5751 


784CIP2B_66 


1965 


394 


2180


3966 


5752 


784CIP2B_67


1967 


395


2181 


3967


5753 


784CIP2B_68 


199b 


396 


2182 


3968 


5754 


784CIP2B_69


2005


397 


2183


3969 


5755 


784CIP2B_70


2027 


398 


2184 


3970 


5756 


784CIP2B_71


2056 


399


2185


3971 


5757 


784CIP2B_72


2103 


400 


2186 


3972 


5758 


784CIP2B_73 


2106 


401 


2187 


3973 


5759 


784CIP2B_74


2166 


402


2188


3974 


5760 


784CIP2B_75


2175 


403 


2189 


3975


5761


784CIP2B_76


2176 


404


2190


3976 


5762 


784CIP2B_78 


2236 


405 


2191


3977 


5763 


784CIP2B_79


2250 


406 


2192 


3978 


5764 


784CIP2B_80


2300 


407


2193 


3979 


5765


784CIP2B_81


2323 


408 


2194 


3980 


5766 


784CIP2B_82 


2340 


409 


2195


3981 


5767 


784CIP2B_83 


2371 


410 


2196


3982 


5768 


784CIP2B_84 


2399 


411 


2197 


3983 


5769 


784CIP2B_85 


2411 


412 


2198


3984 


5770 


784CIP2B_86 


2428 


413 


2199


3985 


5771 


784CIP2B_87 


2430 


414 


2200


3986 


5772 


784CIP2B_88


243S 


415 


2201


3987 


5773 


784CIP2B_89 


2447 


416 


2202 


3988 


5774 


784CIP2B_90


2461 


417 


2203 


3989 


5775 


784CIP2B_91


2487 


418 


2204 


3990 


5776 


784CIP2B_92 


2492 


419 


2205 


3991 


5777 


784CIP2B_93


2512 


420


2206 


3992 


5778


784CIP2B_94 


2564 


421 


2207 


3993 


5779 


784CIP2B_95 


2678 


422 


2208


3994 


5780 


784CIP2B_96 


2816 


423


2209


3995 


5781 


784CIP2B_97 


2818 


424 


2210 


3996 


5782 


784CIP2B_98 


2819 


425 


2211


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B_100


3137 


427 


2213


3999 


5785 


784CIP2B_101


3137 


428 


2214 


4000 


5786 


784CIP2B_102 


3160 


429 


2215


4001 


5787 


784CIP2B_103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


3360 


431 


2217


4003 


5789 


784CIP2B_105


3362


432


2218 


4004


5790


784CIP2B_106


341'i 


433


2219


4005 


5791 


784CIP2B_107 


34 If; 


434 


2220 


4006 


5792 


784CIP2B_108


344; 


435


2221 


4007 


5793 


784CIP2B_109 


344: 


436 


2222 


4008 


5794 


784CIP2B_110


3444


437 


2223 


4009 


5795


784CIP2B_111 


385' 


438 


2224 


4010 


5796 


784CIP2B_112 


386- 


439 


2225 


4011 


5797 


784CIP2B_113


409( 


440 


2226 


4012 


5798 


784CIP2B_114


410: 


441 


2227


4013 


5799 


784CIP2B_115 


4141 


442 


2228 


4014 


5800


784CIP2B_116 


414: 


443 


2229 


4015 


5801 


784CIP2B_117 


4145 


444 


2230 


4016 


5802 


784CIP2B_118


419t 


445 


2231 


4017 


5803


784CIP2B_119 


420? 


446 


2232 


4018 


5804 


784CIP2B_120 


427c 


447 


2233 


4019


5805 


784CIP2B_121 


430<; 


448 


2234 


4020 


5806 


784CIP2B_122 


430( 


449 


2235 


4021 


5807


784CIP2B_123 


431u 


450 


2236 


4022 


5808 


784CIP2B_124 


432 • 


451 


2237 


4023 


5809 


784CIP2B_125 


4321- 


452 


2238 


4024 


5810 


784CIP2B_126 


4332 


453 


2239 


4025 


5811 


784CIP2B_127 


448e 


454 


2240 


4026 


5812 


784CIP2B_128 


458e 


455 


2241 


4027


5813 


784CIP2B_129 


556^ 


456 


2242 


4028 


5814 


784CIP2B_130


557^ 


457 


2243 


4029 


5815 


784CIP2B_131


5577 


458 


2244 


4030 


5816 


784CIP2B_132


557^ 


459 


2245 


4031 


5817 


784CIP2B_133


sse; 


460 


2246 


4032 


5818 


784CIP2B_134 


5S8> 


461


2247 


4033 


5819


784CIP2B_135


5584 


462 


2248 


4034


5820 


784CIP2B_136




463 


2249 


4035 


5821 


784CIP2B_137


5593 


464 


2250 


4036 


5822 


784CIP2B_138


5593 


465 


2251 


4037 


5823 


784CIP2B_139


5594 


466 


2252 


4038 


5824


784CIP2B_140


5594 


467 


2253 


4039 


5825 


784CIP2B_141


5596 


468 


2254 


4040 


5826 


784CIP2B_142


5602 


469


2255 


4041 


5827 


784CIP2B_143


5605 


470 


2256 


4042 


5828 


784CIP2B_144 


5608 


471 


2257 


4043 


5829 


784CIP2B_145 


5617 


472 


2258 


4044 


5830 


784CIP2B_146 


5620 


473 


2259 


4045 


5831 


784CIP2B_147 


5622 


474


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047


5833 


784CIP2B_149 


5624 


476 


2262 


4048


5834 


784CIP2B_150 


5625 


477 


2263 


4049 


5835 


784CIP2B_151 


5627 


478


2264 


4050 


5836 


784CIP2B_152 


5628 


479


2265 


4051


5837 


784CIP2B_153 


5630 


480


2266 


4052 


5838


784CIP2B_154


5632 


481 


2267 


4053 


5839 


784CIP2B_155 


5640


482 


2268 


4054 


5840 


784CIP2B_156 


5643 


483 


2269 


4055 


5841 


784CIP2B_157


5643 


484 


2270 


4056 


5842 


784CIP2B_158


5647 


485 


2271 


4057 


5843 


784CIP2B_159 


5649 


486 


2272 


4058


5844 


784CIP2B_160 


565E 


487


2273 


4059 


5845 


784CIP2B_161 


5659 


488


2274 


4060 


5846 


784CIP2B_162 


5667 


489


2275 


4061 


5847 


784CIP2B_163


5672 


490 


2276 


4062 


5848 


784CIP2B_164


5674 


491


2277 


4063


5849


784CIP2B_165 


5678 


492 


2278 


4064 


5850


784CIP2B_166


5680 


493 


2279 


4065 


5851


784CIP2B_167 


5684 





494


2280


4066 


5852 


784CIP2B_168


5686 


495 


2281 


4067 


5853 


784CIP2B_169


5694 


496


2282 


4068 


5854 


784CIP2B_170 


5698 


497


2283 


4069


5855


784CIP2B_171


5699 


498


2284 


4070 


5856 


784CIP2B_172


5712 


499


2285 


4071 


5857 


784CIP2B_173


5719 


500 


2286 


4072 


5858 


784CIP2B_174


5720 


501 


2287 


4073 


5859 


784CIP2B_175


5727 


502


2288 


4074 


5860 


784CIP2B_176


5730 


503 


2289 


4075 


5861 


784CIP2B_177


5734 


504 


2290 


4076 


5862 


784CIP2B_178


5738 


505


2291 


4077 


5863 


784CIP2B_179


5739 


506 


2292 


4078 


5864 


784CIP2B_180


5740 


507 


2293 


4079


5865


784CIP2B_181


5744 


508


2294 


4080 


5866


784CIP2B_182 


5748 


509


2295 


4081


5867 


784CIP2B_183


5745 


510


2296 


4082


5868 


784CIP2B_184 


5750 


511


2297 


4083 


5869


784CIP2B_185


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085


5871 


784CIP2B_187


5761 


514 


2300 


4086 


5872 


784CIP2B_188


5762 


515


2301 


4087 


5873 


784CIP2B_189


5767 


516 


2302 


4088


5874 


784CIP2B_190


5773 


517


2303


4089


5875 


784CIP2B_191


5783 


518 


2304


4090 


5876 


784CIP2B_192


5784 


519 


2305 


4091 


5877 


784CIP2B_193


5788 


520


2306 


4092 


5878 


784CIP2B_194


5798 


521


2307 


4093 


5879 


784CIP2B_19£ 


5807 


522 


2308 


4094 


5880 


784CIP2B_197


5818 


523


2309 


4095 


5881


784CIP2B_198 


5819 


524 


2310 


4096 


5882 


784CIP2B_199


5827 


525


2311 


4097 


5883 


784CIP2B_200


5828 


526 


2312 


4098 


5884


784CIP2B_201


5842


527


2313


4099 


5885 


784CIP2B_202


5853 


528 


2314 


4100 


5886 


784CIP2B_203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204


5864 


530 


2316 


4102 


5888 


784CIP2B_205 


5865 


531 


2317 


4103 


5889 


784CIP2B_206 


5871 


532 


2318 


4104 


5890 


784CIP2B_207 


5873 


533 


2319 


4105 


5891 


784CIP2B_208


5873


534 


2320 


4106 


5892 


784CIP2B_209


5875 


535


2321 


4107 


5893 


784CIP2B_210 


5876 


536 


2322 


4108 


5894 


784CIP2B_211


5879 


537


2323 


4109 


5895 


784CIP2B_212 


5880 


538


2324 


4110 


5896 


784CIP2B_213 


5880 


539 


2325 


4111 


5897 


784CIP2B_214 


5880 


540


2326 


4112 


5898 


784CIP2B_215 


5880 


541 


2327 


4113 


5899


784CIP2B_216


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901 


784CIP2B_218


5898 


544 


2330 


4116 


5902 


784CIP2B_219


5902 


545 


2331 


4117 


5903


784CIP2B_220 


5904 


546 


2332 


4118 


5904 


784CIP2B_221 


5918 


547 


2333 


4119 


5905 


784CIP2B_222 


5921 


548 


2334 


4120 


5906 


784CIP2B_223 


5927 


549


2335 


4121


5907 


784CIP2B_224 


5932 


550


2336 


4122 


5908 


784CIP2B_225


5939 


551 


2337 


4123 


5909 


784CIP2B_226


5945 


552


2338 


4124 


5910 


784CIP2B_227 


5946


553 


2339 


4125


5911 


784CIP2B_228


5947


554 


2340 


4126 


5912


784CIP2B_229


5956 


555 


2341 


4127 


5913 


784CIP2B_230


5967


556


2342 


4128


5914 


784CIP2B_232 


5975


557


2343 


4129


5915 


784CIP2B_233 


5977 


558


2344 


4130 


5916 


784CIP2B_234


5978 


559


2345 


4131 


5917 


784CIP2B_235


5979 


560 


2346 


4132 


5918 


784CIP2B_236


5980 


561


2347 


4133 


5919 


784CIP2B_237


5988


562 


2348 


4134


5920 


784CIP2B_238


5989 


563


2349 


4135 


5921


784CIP2B_239


5991 


564 


2350 


4136


5922 


784CIP2B_240


5997 


565 


2351


4137 


5923 


784CIP2B_241


5998 


566 


2352 


4138


5924


784CIP2B_242


6003 


567


2353


4139


5925 


784CIP2B_243


6004 


568


2354 


4140


5926


784CIP2B_244


6013 


569 


2355


4141 


5927 


784CIP2B_245


6028 


570 


2356


4142 


5928


784CIP2B_246


6028


571


2357 


4143 


5929 


784CIP2B_247


6029


572 


2358 


4144


5930 


784CIP2B_248


6031 


573


2359


4145


5931


784CIP2B_249


6031


574


2360


4146


5932


784CIP2B_250


6032


575


2361


4147


5933


784CIP2B_251


6037


576


2362


4148


5934


784CIP2B_252


6037


577


2363


4149


5935


784CIP2B_253


6043


578


2364


4150


5936


784CIP2B_254


6044


579


2365


4151


5937


784CIP2B_255


6046


580


2366


4152


5938


784CIP2B_256


6048


581


2367


4153


5939


784CIP2B_257


6049


582


2368


4154


5940


784CIP2B_258


6051


583


2369


4155


5941


784CIP2B_259


6053


584


2370


4156


5942


784CIP2B_260


6060


585


2371


4157


5943


784CIP2B_261


6063


586


2372


4158


5944


784CIP2B_262


6066


587


2373


4159


5945


784CIP2B_263


6067


588


2374


4160


5946


784CIP2B_264


6066


589


2375


4161


5947


784CIP2B_265


6072


590


2376


4162


5948


784CIP2B_266


6076


591


2377


4163


5949


784CIP2B_267


6076


592


2378


4164


5950


784CIP2B_268


6077


593


2379


4165


5951


784CIP2B_269


6075


594


2380


4166


5952


784CIP2B_270


6082


595


2381


4167


5953


784CIP2B_272


6086


596 


2382 


4168


5954 


784CIP2B_273


6091 


597 


2383


4169


5955 


784CIP2B_274


6094 


598


2384 


4170


5956 


784CIP2B_275


6101 


599 


2385 


4171 


5957 


784CIP2B_276


6103 


600 


2386 


4172 


5958 


784CIP2B_277


6104 


601


2387


4173


5959


784CIP2B_278


6106 


602 


2388 


4174 


5960 


784CIP2B_279


6112 


603


2389 


4175 


5961 


784CIP2B_280


6121 


604 


2390 


4176 


5962 


784CIP2B_281


6125 


605


2391 


4177 


5963 


784CIP2B_282


6126 


606 


2392 


4178 


5964 


784CIP2B_283


6128 


607 


2393 


4179 


5965 


784CIP2B_284


6129 


608


2394 


4180 


5966 


784CIP2B_285


6133 


609 


2395 


4181 


5967 


784CIP2B_286


6133 


610 


2396 


4182 


5968 


784CIP2B_287


6135 


611 


2397 


4183 


5969 


784CIP2B_288


6139 


612 


2398 


4184 


5970 


784CIP2B_289


6141 


613 


2399 


4185


5971 


784CIP2B_290


6145 


614 


2400 


4186 


5972 


784CIP2B_291


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6146 


616 


2402 


4188 


5974 


784CIP2B_293


6149 


617


2403


4189 


5975 


784CIP2B_294


6149


618


2404 


4190 


5976 


784CIP2B_295


6153 


619


2405 


4191


5977 


784CIP2B_296


6159 


620


2406 


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979


784CIP2B_298


6167 


622 


2408 


4194 


5980 


784CIP2B_299


6172 


623


2409


4195 


5981 


784CIP2B_300


6173 


624 


2410 


4196


5982 


784CIP2B_301


6190 


625 


2411 


4197 


5983


784CIP2B_302


6194 


626 


2412 


4198


5984 


784CIP2B_303 


6196 


627 


2413 


4199


5985 


784CIP2B_304


6197 


628


2414 


4200 


5986 


784CIP2B_305


6198


629 


2415 


4201 


5987 


784CIP2B_306


6198 


630 


2416 


4202 


5988 


784CIP2B_308


6214 


631 


2417 


4203 


5989 


784CIP2B_309 


6215 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633 


2419 


4205 


5991 


784CIP2B_311


6226 


634


2420 


4206 


5992 


784CIP2B_312


6225 


635 


2421


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B_314


6237 


637 


2423 


4209 


5995 


784CIP2B_315


6238 


638


2424 


4210 


5996 


784CIP2B_316


6239 


639 


2425 


4211 


5997 


784CIP2B_317


6239 


640 


2426 


4212 


5998 


784CIP2B_318


6239 


641 


2427 


4213 


5999 


784CIP2B_319


6240 


642 


2428 


4214 


6000 


784CIP2B_320


6244 


643 


2429 


4215 


6001 


784CIP2B_321 


6245 


644 


2430 


4216


6002 


784CIP2B_322


6250 


645 


2431 


4217 


6003 


784CIP2B_323


6252 


646


2432 


4218 


6004 


784CIP2B_324


6252 


647


2433 


4219 


6005 


784CIP2B_325 


6256 


648


2434 


4220 


6006 


784CIP2B_326


6260 


649


2435 


4221 


6007 


784CIP2B_327


6261 


650


2436 


4222 


6008 


784CIP2B_328


6264 


651 


2437


4223 


6009 


784CIP2B_329


6265 


652 


2438 


4224 


6010 


784CIP2B_330 


6266 


653 


2439 


4225 


6011 


784CIP2B_331


6270 


654


2440 


4226 


6012


784CIP2B_332


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274


656


2442 


4228 


6014 


784CIP2B_335


6276 


657


2443 


4229 


6015 


784CIP2B_336


6281 


658 


2444 


4230 


6016 


784CIP2B_337


6281 


659 


2445 


4231 


6017 


784CIP2B_338


6288 


660 


2446 


4232 


6018 


784CIP2B_339 


6292 


661 


2447 


4233


6019 


784CIP2B_340


6294 


662


2448


4234 


6020 


784CIP2B_343 


6312 


663 


2449


4235


6021 


784CIP2B_344


6312 


664


2450 


4236


6022 


784CIP2B_345


6312 


665 


2451 


4237 


6023 


784CIP2B_346


6322 


666


2452 


4238


6024 


784CIP2B_347


6324 


667


2453 


4239 


6025 


784CIP2B_349 


6329 


668


2454 


4240 


6026 


784CIP2B_350


6331 


669


2455 


4241 


6027 


784CIP2B_351 


6333 


670 


2456 


4242 


6028 


784CIP2B_352 


6334 


671 


2457 


4243 


6029


784CIP2B_353


6337 


672 


2458 


4244 


6030 


784CIP2B_354


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460 


4246 


6032 


784CIP2B_356


6348


675 


2461 


4247 


6033 


784CIP2B_357 


6348 


676 


2462 


4248 


6034 


784CIP2B_358


6350 


677 


2463 


4249 


6035 


784CIP2B_359


6351 


678 


2464 


4250 


6036 


784CIP2B_360 


6355 


679 


2465 


4251 


6037 


784CIP2B_361


6362


680


2466 


4252 


6038


784CIP2B_362


6368 


681


2467 


4253 


6039


784CIP2B_363


6365 


682 


2468


4254 


6040 


784CIP2B_364


6371 


683 


2469


4255 


6041 


784CIP2B_365


6376 


684 


2470 


4256 


6042 


784CIP2B_366


6379 


685


2471 


4257 


6043 


784CIP2B_367


6380 


686 


2472 


4258


6044 


784CIP2B_368


6381 


687 


2473 


4259


6045 


784CIP2B_369 


6392 


688 


2474 


4260


6046 


784CIP2B_370


6395 


689


2475


4261 


6047 


784CIP2B_371


6397 


690 


2476 


4262 


6048 


784CIP2B_372


6400


691 


2477 


4263 


6049 


784CIP2B_373


6401 


692 


2478 


4264 


6050 


784CIP2B_374


6411 


693 


2479


4265 


6051 


784CIP2B_375


6411 


694 


2480 


4266 


6052 


784CIP2B_376


6411 


695 


2481


4267 


6053 


784CIP2B_377


6416 


696 


2482 


4268


6054 


784CIP2B_378


6418 


697 


2483 


4269


6055 


784CIP2B_379


6422 


698


2484


4270 


6056


784CIP2B_380


6423 


699


2485


4271 


6057 


784CIP2B_381


6426 


700 


2486


4272 


6058


784CIP2B_382


6427


701 


2487


4273


6059 


784CIP2B_383


6426 


702 


2488


4274 


6060 


784CIP2B_384


6429 


703 


2489 


4275


6061 


784CIP2B_385


6430 


704 


2490 


4276 


6062 


784CIP2B_386


6432 


705


2491


4277 


6063 


784CIP2B_387


6432


706 


2492 


4278


6064


784CIP2B_388


6436


707 


2493 


4279


6065 


784CIP2B_389


6441 


708 


2494


4280


6066 


784CIP2B_390


6446 


709 


2495 


4281 


6067 


784CIP2B_391


6454


710 


2496 


4282 


6068




6459 


711 


2497 


4283 


6069 


784CIP2B_394


6461 


712 


2498


4284


6070 


784CIP2B_395


6467


713 


2499


4285 


6071 


784CIP2B_396


6468 


714 


2500 


4286 


6072 


784CIP2B_397


6487 


715 


2501 


4287 


6073 


784CIP2B_398


6491 


716 


2502 


4288


6074 


784CIP2B_399


6506 


717 


2503 


4289 


6075 


784CIP2B_401


6514 


718


2504 


4290 


6076 


784CIP2B_402


6519 


719 


2505 


4291 


6077 


784CIP2B_403


6521 


720 


2506 


4292 


6078 


784CIP2B_404


6532 


721 


2507 


4293 


6079 


784CIP2B_405


6536 


722 


2508 


4294 


6080 


784CIP2B_406


6543 


723 


2509 


4295 


6081 


784CIP2B_407


6544 


724 


2510 


4296 


6082 


784CIP2B_408


6548


725 


2511


4297 


6083 


784CIP2B_409


6553 


726 


2512 


4298 


6084 


784CIP2B_410


6553 


727 


2513 


4299 


6085 


784CIP2B_411 


6552 


728 


2514 


4300 


6086 


784CIP2B_412 


6554 


729 


2515 


4301 


6087 


784CIP2B_413 


6556


730 


2516 


4302 


6088 


784CIP2B_414 


6560 


731 


2517 


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090 


784CIP2B_416 


6564 


733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306 


6092 


784CIP2B_418 


6572 


735 


2521 


4307 


6093 


784CIP2B_419


6575 


736 


2522 


4308 


6094 


784CIP2B_420 


6577 


737 


2523 


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422


6595 


739 


2525 


4311 


6097 


784CIP2B_423


6599 


740 


2526 


4312 


6098 


784CIP2B_424 


6625 


741 


2527 


4313 


6099 


784CIP2B_425 


6625


742


2528


4314 


6100 


784CIP2B_426 


6626 


743 


2529


4315 


6101 


784CIP2B_427


663C 


744 


2530


4316 


6102 


784CIP2B_428 


6631


745 


2531


4317 


6103 


784CIP2B_429 


6632


746 


2532


4318 


6104 


784CIP2B_430 


6633 


747 


2533 


4319 


6105 


784CIP2B_431


6634 


748 


2534 


4320 


6106 


784CIP2B_432


6638 


749


2535


4321 


6107 


784CIP2B_433


6643 


750 


2536


4322


6108


784CIP2B_434


6644 


751 


2537


4323 


6109 


784CIP2B_435


6 64t 


752 


2538


4324 


6110 


784CIP2B_436 


6646 


753 


2539


4325 


6111 


784CIP2B_437 


6652 


754 


2540 


4326 


6112 


784CIP2B_438


6654 


755 


2541


4327 


6113 


784CIP2B_439


6657 


756 


2542


4328 


6114 


784CIP2B_440


6656 


757 


2543


4329 


6115 


784CIP2B_441


6663 


758


2544 


4330 


6116 


784CIP2B_442 


6664 


759


2545


4331 


6117 


784CIP2B_443 


6666 


760 


2546


4332 


6118 


784CIP2B_444


6665 


761 


2547 


4333 


6119 


784CIP2B_445


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549


4335


6121 


784CIP2B_447


6687 


764 


2550


4336 


6122 


784CIP2B_448


6689 


765 


2551 


4337 


6123 


784CIP2B_449


6693 


766 


2552


4338


6124 


784CIP2B_450 


6698 


767 


2553


4339 


6125 


784CIP2B_451


6695 


768 


2554 


4340


6126 


784CIP2B_452 


6705 


769 


2555


4341


6127 


784CIP2B_453 


6711 


770 


2556


4342 


6128


784CIP2B_454


6713 


771 


2557


4343 


6129


784CIP2B_455


6716 


772 


2558


4344 


6130 


784CIP2B_456 


6725 


773 


2559


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561


4347 


6133 


784CIP2B_459


6730 


776 


2562


4348 


6134 


784CIP2B_460 


6730 


777 


2563 


4349 


6135 


784CIP2B_461 


6730 


778 


2564 


4350 


6136 


784CIP2B_462 


6732 


779 


2565


4351 


6137 


784CIP2B_463


6733 


780 


2566


4352 


6138


784CIP2B_464 


6737 


781 


2567 


4353 


6139


784CIP2B_465


674£ 


782 


2568


4354 


6140


784CIP2B_466 


6751 


783 


2569


4355 


6141 


784CIP2B_467 


6754 


784 


2570 


4356 


6142 


784CIP2B_468


6758 


785 


2571 


4357 


6143 


784CIP2B_469 


6761 


786 


2572 


4358 


6144 


784CIP2B_470


6765 


787 


2573 


4359 


6145 


784CIP2B_471 


6768 


788 


2574 


4360 


6146 


784CIP2B_472 


6773 


789 


2575 


4361 


6147 


784CIP2B_473 


6776 


790 


2576


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475


679£ 


792 


2578


4364 


6150 


784CIP2B_476 


6823 


793 


2579


4365


6151 


784CIP2B_477 


6825 


794 


2580 


4366 


6152 


784CIP2B_478 


6826 


795 


2581


4367 


6153 


784CIP2B_479 


6835 


796 


2582 


4368 


6154 


784CIP2B_480


6844 


797 


2583 


4369 


6155 


784CIP2B_482 


684S 


798 


2584 


4370 


6156 


784CIP2B_483 


6854 


799 


2585


4371 


6157 


784CIP2B_484


6857 


800 


2586 


4372 


6158 


784CIP2B_485


6861 


801 


2587


4373 


6159 


784CIP2B_486


6873 


802 


2588 


4374 


6160 


784CIP2B_487


6875 


803 


2589


4375 


6161 


784CIP2B_488


6877 







804 


2590 


4376


6162


784CIP2B_489


6880 


805


2591


4377


6163


784CIP2B_490


68e5 


806 


2592 


4378


6164


784CIP2B_491


6890 


807 


2593 


4379 


6165


784CIP2B_492


6890 


808


2594


4380


6166


784CIP2B_493


6894 


809


2595


4381


6167


784CIP2B_494


690! 


810 


2596


4382


6168


784CIP2B_495


6904 


811 


2597 


4383 


6169


784CIP2B_496


6907


812 


2598


4384


6170


784CIP2B_497


6914 


813 


2599 


4385 


6171


784CIP2B_498


6917 


814 


2600 


4386


6172


784CIP2B_499


6923


815 


2601


4387


6173


784CIP2B_500


6925


816 


2602 


4388 


6174 


784CIP2B_501


6932 


817


2603


4389


6175


784CIP2B_502


6935 


818


2604 


4390 


6176 


784CIP2B_503


6940 


819


2605


4391


6177


784CIP2B_504


6945 


820


2606


4392


6178


784CIP2B_505


6946 


821


2607


4393


6179


784CIP2B_506


6947


822 


2608


4394


6180


784CIP2B_507


694 5 


823 


2609


4395


6181


784CIP2B_508


6955 


824 


2610 


4396


6182


784CIP2B_509


6960 


825


2611


4397


6183


784CIP2B_510


6962 


826 


2612 


4398


6184


784CIP2B_511


6963 


827 


2613 


4399 


6185


784CIP2B_512


6967 


828


2614


4400


6186


784CIP2B_513


6983 


829


2615


4401


6187


784CIP2B_514


6986 


830 


2616 


4402


6188


784CIP2B_515


6996 


831 


2617 


4403 


6189


784CIP2B_516


7003


832 


2618


4404


6190


784CIP2B_517


701fc 


833 


2619 


4405


6191


784CIP2B_518


7017 


834 


2620


4406


6192


784CIP2B_519


7025 


835 


2621 


4407 


6193 


784CIP2B_520


7025 


836 


2622 


4408


6194


784CIP2B_521


7025 


837 


2623 


4409


6195


784CIP2B_522


7050 


838 


2624 


4410 


6196


784CIP2B_523


7051 


839


2625


4411


6197


784CIP2B_524


7055 


840 


2626 


4412 


6198 


784CIP2B_525


7060 


841 


2627 


4413


6199


784CIP2B_526


7064 


842 


2628 


4414 


6200


784CIP2B_527


7067 


843 


2629 


4415 


6201 


784CIP2B_528


7071 


844


2630


4416 


6202 


784CIP2B_529


7072 


845 


2631 


4417 


6203 


784CIP2B_530


7073 


846 


2632 


4418


6204


784CIP2B_531


7076


847


2633 


4419 


6205 


784CIP2B_532


7088 


848 


2634 


4420


6206


784CIP2B_533


7089 


849


2635


4421


6207


784CIP2B_534


7092 


850 


2636 


4422 


6208


784CIP2B_535


7091 


851 


2637 


4423 


6209


784CIP2B_536


7104 


852 


2638 


4424 


6210 


784CIP2B_537


7105 


853 


2639 


4425 


6211


784CIP2B_538


7105 


854 


2640 


4426 


6212


784CIP2B_539


7109 


855 


2641 


4427 


6213 


784CIP2B_540


7109 


856 


2642 


4428 


6214


784CIP2B_541 


7119 


857


2643 


4429 


6215


784CIP2B_542 


7120 


858 


2644 


4430


6216


784CIP2B_543


7121 


859


2645


4431


6217


784CIP2B_544


7126 


860


2646


4432


6218


784CIP2B_545


7127 


861


2647


4433


6219


784CIP2B_546


7130 


862 


2648 


4434 


6220 


784CIP2B_547 


7131 


863 


2649 


4435 


6221


784CIP2B_548 


7144 


864 


2650 


4436 


6222 


784CIP2B_549


7159 


865 


2651 


4437 


6223 


784CIP2B_550 


7163 




866


2652


4438


6224


784CIP2B_551


7175 


867


2653


4439


6225 


784CIP2B_552 


7188 


868


2654


4440


6226 


784CIP2B_553 


7189 


869


2655 


4441 


6227 


784CIP2B_554


7190 


870 


2656 


4442 


6228


784CIP2B_555 


719} 


871 


2657 


4443 


6229


784CIP2B_556 


7203 


872


2658


4444


6230


784CIP2B_557 


7204 


873 


2659 


4445 


6231 


784CIP2B_558


7208 


874 


2660


4446 


6232 


784CIP2B_559 


7205 


875 


2661 


4447 


6233 


784CIP2B_560 


7210 


876 


2662 


4448 


6234 


784CIP2B_561


7216 


877 


2663 


4449


6235 


784CIP2B_562 


7221 


878


2664 


4450 


6236 


784CIP2B_563 


7230 


879


2665


4451


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238


784CIP2B_565 


7240 


881


2667 


4453 


6239 


784CIP2B_566


7245


882 


2668


4454


6240


784CIP2B_567 


7250 


883 


2669


4455 


6241 


784CIP2B_568


7251 


884 


2670


4456 


6242 


784CIP2B_569 


7255 


885


2671


4457


6243


784CIP2B_570 


7260 


886 


2672 


4458 


6244 


784CIP2B_571 


7265 


887


2673 


4459 


6245 


784CIP2B_572 


7268 


888


2674


4460


6246


784CIP2B_573


7275 


889


2675


4461


6247


784CIP2B_574


7279 


890 


2676 


4462 


6248


784CIP2B_575


7283 


891 


2677 


4463 


6249


784CIP2B_576


7283 


892 


2678 


4464 


6250


784CIP2B_577


7287 


893 


2679


4465 


6251 


784CIP2B_578 


7301 


894 


2680 


4466 


6252 


784CIP2B_579


7308 


895


2681 


4467 


6253 


784CIP2B_580


7306 


896 


2682 


4468 


6254


784CIP2B_581


7309 


897 


2683 


4469 


6255 


784CIP2B_582


7319 


898 


2684 


4470 


6256 


784CIP2B_583


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258


784CIP2B_585 


7326 


901 


2687 


4473 


6259


784CIP2B_586 


7334 


902 


2688 


4474 


6260


784CIP2B_587


7337 


903


2689 


4475 


6261 


784CIP2B_588


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905


2691 


4477 


6263


784CIP2B_590 


73S5 


906


2692 


4478


6264


784CIP2B_591


7363 


907 


2693 


4479 


6265 


784CIP2B_592 


7363 


908 


2694


4480 


6266 


784CIP2B_593 


7365 


909


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482 


6268


784CIP2B_595 


7369 


911 


2697 


4483 


6269


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599 


7375 


913 


2699 


4485 


6271 


784CIP2B_600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


7391 


917 


2703 


4489 


6275


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B_605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


7397 


920 


2706 


4492 


6278 


784CIP2B_607


7395 


921 


2707 


4493 


6279


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B_609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610 


7406 


924 


2710 


4496 


6282


784CIP2B_611 


7405 


925 


2711 


4497 


6283 


784CIP2B_612 


7410 


926 


2712 


4498 


6284 


784CIP2B_613 


7411 


927 


2713 


4499 


6285 


784CIP2B_614


7417 







928


2714


4500


6286


784CIP2B_615


7418 




929


2715


4501


6287


784CIP2B_616


7421 


930


2716


4502


6288


784CIP2B_617


7422 


931


2717


4503


6289


784CIP2B_618


7422 


932 


2718 


4504


6290


784CIP2B_619


7423 


933 


2719


4505


6291


784CIP2B_620


7424 


934 


2720 


4506


6292


784CIP2B_621


7426 


935


2721


4507


6293


784CIP2B_622


7427 


936 


2722 


4508


6294


784CIP2B_623


7428 


937


2723


4509


6295


784CIP2B_624


7430 


938 


2724 


4510


6296


784CIP2B_625


7435 


939


2725 


4511 


6297 


784CIP2B_626 


7437 


940 


2726 


4512 


6298 


784CIP2B_627 


7439 


941


2727 


4513 


6299 


784CIP2B_628 


7440 


942 


2728 


4514 


6300 


784CIP2B_629


7442 


943 


2729 


4515 


6301 


784CIP2B_630


7450 


944 


2730 


4516


6302


784CIP2B_631


7451 


945 


2731 


4517 


6303 


784CIP2B_632


7452 


946


2732


4518


6304


784CIP2B_633


7454 


947


2733


4519


6305


784CIP2B_634 


7457 


948


2734


4520


6306


784CIP2B_635


7459 


949 


2735 


4521


6307 


784CIP2B_636 


7461 


950 


2736 


4522 


6308 


784CIP2B_637


7463 


951


2737 


4523 


6309 


784CIP2B_638 


7466 


952 


2738 


4524 


6310 


784CIP2B_639


7469 


953 


2739 


4525 


6311 


784CIP2B_640


7473 


954


2740


4526


6312


784CIP2B_641


7481 


955 


2741 


4527 


6313 


784CIP2B_642 


7482 


956 


2742 


4528


6314


784CIP2B_643


7482 


957 


2743 


4529


6315


784CIP2B_644


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531


6317


784CIP2B_646


7486


960 


2746 


4532 


6318 


784CIP2B_647


7487 


961 


2747 


4533


6319


784CIP2B_648


7491 


962 


2748 


4534 


6320 


784CIP2B_649


7492 


963 


2749 


4535 


6321 


784CIP2B_650 


7494 


964 


2750 


4536 


6322 


784CIP2B_651


7498 


965 


2751 


4537 


6323 


784CIP2B_652


7504 


966 


2752 


4538


6324


784CIP2B_653


7508 


967 


2753 


4539 


6325 


784CIP2B_654


7516 


968


2754


4540


6326


784CIP2B_655


7518 


969 


2755 


4541


6327 


784CIP2B_656 


7519 


970 


2756 


4542 


6328


784CIP2B_657


7521 


971 


2757 


4543 


6329


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B_659 


7532 


973 


2759 


4545


6331


784CIP2B_660


7533 


974 


2760 


4546 


6332 


784CIP2B_661 


7535 


975 


2761 


4547


6333


784CIP2B_662


7545 


976


2762


4548


6334


784CIP2B_663


7546


977 


2763


4549


6335


784CIP2B_664


7552 


978 


2764 


4550 


6336 


784CIP2B_665


7554 


979


2765


4551


6337


784CIP2B_666


7567 


980 


2766 


4552 


6338 


784CIP2B_667


7569 


981 


2767 


4553 


6339 


784CIP2B_668


7575 


982 


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769


4555


6341


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B_671


7579 


985 


2771 


4557


6343


784CIP2B_672


7582 


986 


2772 


4558


6344


784CIP2B_673


7587 


987 


2773 


4559


6345


784CIP2B_674


7589 


988


2774


4560


6346


784CIP2B_675


7597 


989


2775


4561


6347


784CIP2B_676


7597 







990 


2776 


4562 


6348


784CIP2B_677


7609 


991


2777


4563


6349


784CIP2B_678


7609 


992


2778 


4564 


6350 


784CIP2B_679 


7609 


993


2779


4565


6351


784CIP2B_680 


7613 


994 


2780 


4566 


6352 


784CIP2B_681


7623 


995 


2781 


4567 


6353 


784CIP2B_682


7629 


996


2782


4568


6354


784CIP2B_683


7630 


997 


2783 


4569


6355 


784CIP2B_684 


7633 


998


2784


4570


6356


784CIP2B_685


7635 


999


2785


4571


6357 


784CIP2B_686 


7638 


1000 


2786 


4572 


6358 


784CIP2B_687


7639 


1001 


2787 


4573 


6359 


784CIP2B_688 


7646 


1002 


2788 


4574 


6360 


784CIP2B_689


7647 


1003 


2789


4575 


6361 


784CIP2B_690


7648 


1004 


2790 


4576 


6362 


784CIP2B_691 


7658


1005 


2791 


4577 


6363 


784CIP2B_692


7664 


1006


2792 


4578 


6364 


784CIP2B_693


7664 


1007 


2793 


4579


6365 


784CIP2B_695 


7674


1008


2794 


4580 


6366 


784CIP2B_696


7675 


1009 


2795 


4581


6367


784CIP2B_697


7676 


1010 


2796 


4582 


6368 


784CIP2B_698 


7681 


1011


2797


4583


6369 


784CIP2B_699 


7668 


1012 


2798 


4584 


6370 


784CIP2B_700 


7693 


1013 


2799 


4585 


6371 


784CIP2B_701 


7694 


1014 


2800 


4586 


6372 


784CIP2B_702 


7715 


1015 


2801 


4587 


6373 


784CIP2B_703


7716 


1016 


2802 


4588 


6374 


784CIP2B_704


7718 


1017 


2803 


4589


6375


784CIP2B_705


7721 


1018 


2804 


4590


6376


784CIP2B_706


7723 


1019 


2805 


4591


6377


784CIP2B_707


7729 


1020 


2806 


4592


6378


784CIP2B_708


7733


1021 


2807 


4593 


6379 


784CIP2B_709 


7735 


1022 


2808 


4594 


6380 


784CIP2B_710


7741 


1023 


2809 


4595


6381


784CIP2B_711


7743 


1024 


2810 


4596 


6382 


784CIP2B_712 


7746


1025 


2811 


4597


6383 


784CIP2B_713 


7745 


1026 


2812 


4598 


6384


784CIP2B_714


7750 


1027


2813


4599


6385


784CIP2B_715


7757 


1028 


2814 


4600 


6386 


784CIP2B_716


7759 


1029


2815


4601


6387


784CIP2B_717


7760 


1030 


2816 


4602 


6388


784CIP2B_718


7760 


1031 


2817 


4603 


6389 


784CIP2B_719 


7764 


1032 


2818 


4604 


6390 


784CIP2B_720


7765 


1033 


2819 


4605 


6391 


784CIP2B_721


7766 


1034 


2820 


4606 


6392 


784CIP2B_722


7767 


1035 


2821 


4607 


6393 


784CIP2B_723


7769 


1036 


2822 


4608 


6394 


784CIP2B_724


7770 


1037 


2823 


4609


6395 


784CIP2B_725 


7774 


1038


2824 


4610 


6396 


784CIP2B_726 


7779 


1039


2825


4611


6397


784CIP2B_727


7781 


1040 


2826 


4612 


6398 


784CIP2B_728 


7782 


1041 


2827 


4613 


6399 


784CIP2B_729 


7783 


1042 


2828


4614 


6400 


784CIP2B_730 


7787 


1043 


2829


4615


6401


784CIP2B_731


7792 


1044 


2830 


4616 


6402 


784CIP2B_732


7795 


1045 


2831 


4617 


6403 


784CIP2B_733


7801 


1046 


2832 


4618 


6404 


784CIP2B_734


7807 


1047 


2833 


4619


6405


784CIP2B_735


7808


1048 


2834


4620 


6406 


784CIP2B_736


7819 


1049 


2835 


4621


6407


784CIP2B_737


7824 


1050


2836 


4622 


6408 


784CIP2B_738


7826 


1051 


2837 


4623 


6409 


784CIP2B_739 


7829 




1052 


2838


4624


6410


784CIP2B_740 


7832 


1053


2839


4625


6411


784CIP2B_741


783S 


1054 


2840 


4626 


6412 


784CIP2B_743


7847 


1055


2841


4627


6413


784CIP2B_744


7848 


1056


2842


4628


6414


784CIP2B_745


7853


1057


2843 


4629 


6415 


784CIP2B_746


7854 


1058


2844


4630


6416


784CIP2B_747


7856 


1059


2845


4631


6417


784CIP2B_748


7862 


1060 


2846 


4632 


6418 


784CIP2B_749


7865 


1061


2847


4633


6419


784CIP2B_750


7874 


1062 


2848 


4634 


6420 


784CIP2B_751


7877 


1063 


2849 


4635 


6421


784CIP2B_752 


7880 


1064 


2850 


4636 


6422 


784CIP2B_753


7882 


1065 


2851 


4637 


6423 


784CIP2B_754


7884 


1066 


2852 


4638 


6424 


784CIP2B_755


7886 


1067 


2853 


4639 


6425 


784CIP2B_756 


7886 


1068


2854 


4640 


6426 


784CIP2B_757 


7889 


1069 


2855


4641


6427


784CIP2B_758


7901 


1070 


2856 


4642 


6428 


784CIP2B_759 


7910 


1071


2857 


4643 


6429 


784CIP2B_760 


7911 


1072 


2858


4644 


6430 


784CIP2B_761 


7921 


1073 


2859


4645


6431


784CIP2B_762


7923 


1074 


2860


4646


6432


784CIP2B_763


7924 


1075 


2861


4647


6433


784CIP2B_764


7925 


1076 


2862 


4648


6434


784CIP2B_765


7928 


1077 


2863 


4649 


6435 


784CIP2B_766


7929 


1078 


2864 


4650 


6436 


784CIP2B_767 


7930 


1079


2865


4651


6437


784CIP2B_768


7934 


1080 


2866 


4652 


6438


784CIP2B_769


7938 


1081 


2867 


4653 


6439 


784CIP2B_770


7942


1082 


2868


4654


6440


784CIP2B_771


7945


1083 


2869 


4655 


6441 


784CIP2B_772 


7946


1084


2870 


4656 


6442 


784CIP2B_773 


7948


1085


2871


4657


6443


784CIP2B_774 


7951 


1086


2872 


4658 


6444 


784CIP2B_775 


7952 


1087 


2873 


4659 


6445 


784CIP2B_776


7953 


1088 


2874 


4660 


6446 


784CIP2B_777 


7954 


1089 


2875 


4661 


6447 


784CIP2B_778


7957 


1090 


2876 


4662 


6448


784CIP2B_779


7958 


1091 


2877 


4663 


6449


784CIP2B_780


7961 


1092 


2878 


4664 


6450 


784CIP2B_781


7965


1093 


2879 


4665 


6451 


784CIP2B_782 


7966


1094 


2880 


4666 


6452 


784CIP2B_783 


7979 


1095 


2881 


4667 


6453 


784CIP2B_784 


7986 


1096 


2882 


4668 


6454 


784CIP2B_785 


7986 


1097 


2883


4669 


6455 


784CIP2B_786 


7988 


1098 


2884 


4670 


6456 


784CIP2B_787


7991 


1099 


2885 


4671 


6457 


784CIP2B_788 


7992 


1100 


2886 


4672 


6458 


784CIP2B_789


7992 


1101 


2887 


4673 


6459 


784CIP2B_790 


7992 


1102 


2888


4674


6460


784CIP2B_791


7992 


1103 


2889 


4675 


6461 


784CIP2B_792 


8003 


1104 


2890 


4676 


6462 


784CIP2B_793


8014 


1105 


2891 


4677 


6463 


784CIP2B_794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795 


8016 


1107 


2893 


4679 


6465 


784CIP2B_796


8017 


1108


2894 


4680 


6466 


784CIP2B_797 


8019 


1109 


2895 


4681 


6467 


784CIP2B_798


8020 


1110 


2896 


4682 


6468


784CIP2B_799


8022 


1111 


2897


4683


6469


784CIP2B_800


8022 


1112 


2898 


4684 


6470 


784CIP2B_801 


8028 


1113


2899


4685


6471


784CIP2B_802


8030 


1114 


2900


4686


6472


784CIP2B_803


8038 


1115


2901


4687


6473


784CIP2B_804


8042 


1116 


2902 


4688


6474


784CIP2B_805


8045


1117 


2903 


4689 


6475 


784CIP2B_806




1118


2904 


4690 


6476 


784CIP2B_807 


8046


1119 


2905


4691 


6477 


784CIP2B_808 


8047


1120 


2906 


4692 


6478 


784CIP2B_809


8051 


1121 


2907 


4693


6479 


784CIP2B_810 


8059 


1122 


2908 


4694 


6480 


784CIP2B_811


8064 


1123 


2909 


4695 


6481 


784CIP2B_812


8069 


1124 


2910 


4696 


6482


784CIP2B_813


8074 


1125 


2911 


4697 


6483 


784CIP2B_814


8077 


1126 


2912 


4698 


6484 


784CIP2B_815


8078 


1127 


2913 


4699 


6485 


784CIP2B_816


8079 


1128


2914


4700


6486


784CIP2B_817 


8084 


1129 


2915


4701


6487


784CIP2B_818


8088


1130 


2916 


4702


6488 


784CIP2B_819 


8090


1131 


2917 


4703 


6489 


784CIP2B_820


8091


1132 


2918 


4704 


6490 


784CIP2B_821 


8099 


1133 


2919 


4705 


6491 


784CIP2B_822


8099


1134 


2920 


4706 


6492 


784CIP2B_823 


8100


1135 


2921 


4707 


6493


784C1P2B_824 


8102 


1136 


2922 


4708


6494 


784CIP2B_825 


8103 


1137


2923


4709


6495 


784CIP2B_826 


8103


1138 


2924 


4710 


6496 


784CIP2B_827


8104 


1139 


2925 


4711 


6497 


784CIP2B_828 


8108 


1140 


2926 


4712 


6498 


784CIP2B_829 


8110 


1141 


2927 


4713


6499


784CIP2B_830 


8136 


1142


2928


4714 


6500 


784CIP2B_831 


8117


1143 


2929 


4715 


6501


784CIP2B_832


8123 


1144 


2930 


4716 


6502


784CIP2B_833


8130 


1145


2931 


4717 


6503 


784CIP2B_834 


8130 


1146 


2932 


4718


6504


784CIP2B_835


8143 


1147 


2933 


4719 


6505


784CIP2B_836


8143 


1148 


2934 


4720 


6506 


784CIP2B_837 


8154 


1149 


2935 


4721 


6507 


784CIP2B_838 


8155 


1150


2936


4722


6508


784CIP2B_839


8162 


1151 


2937


4723 


6509 


784CIP2B_840 


8163 


1152 


2938 


4724


6510


784CIP2B_841


8172 


1153 


2939 


4725 


6511 


784CIP2B_842


8173 


1154 


2940 


4726 


6512 


784CIP2B_843


817S- 


1155 


2941 


4727 


6513 


784CIP2B_844 


8162 


1156 


2942 


4728 


6514 


784CIP2B_845 


8183 


1157 


2943 


4729


6515


784CIP2B_846


8164 


1158 


2944 


4730 


6516 


784CIP2B_847


8185 


1159


2945


4731


6517


784CIP2B_848


8187 


1160 


2946


4732 


6518 


784CIP2B_849 


8186 


1161


2947


4733


6519 


784CIP2B_850 


8190 


1162 


2948


4734 


6520 


784CIP2B_851 


8150 


1163 


2949 


4735 


6521 


784CIP2B_852 


8192 


1164 


2950 


4736 


6522 


784CIP2B_853


8193 


1165 


2951 


4737 


6523 


784CIP2B_854


8197 


1166 


2952 


4738 


6524 


784CIP2B_855


8197 


1167 


2953 


4739 


6525 


784CIP2B_856


8195 


1168 


2954 


4740 


6526 


784CIP2B_857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858


8203 


1170


2956


4742


6528


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B_860 


8209 


1172 


2958 


4744 


6530 


784CIP2B_861


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863


8217 


1175 


2961 


4747 


6533 


784CIP2B_864 


8223 


1176 


2962


4748 


6534 


784CIP2B_865 


8224


1177 


2963


4749


6535


784CIP2B_866


8226


1178 


2964 


4750 


6536


784CIP2B_867 


8227 


1179 


2965


4751 


6537 


784CIP2B_868 


8229 


1180 


2966


4752


6538


784CIP2B_869 


8232 


1181


2967


4753


6539


784CIP2B_870


8236 


1182 


2968 


4754 


6540


784CIP2B_871


8239 


1183


2969


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542


784CIP2B_873


8245


1185 


2971


4757


6543


784CIP2B_874 


8246 


1186 


2972


4758 


6544 


784CIP2B_875 


825- 


1187 


2973


4759


6545


784CIP2B_876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877


8260


1189 


2975 


4761 


6547 


784CIP2B_878 


8262


1190 


2976 


4762 


6548


784CIP2B_879


8268


1191 


2977


4763


6549


784CIP2B_880


8270 


1192 


2978


4764


6550


784CIP2B_881


8272 


1193


2979


4765


6551


784CIP2B_882


8274


1194 


2980


4766


6552


784CIP2B_883


8274


1195 


2981


4767


6553


784CIP2B_884


8275


1196 


2982 


4768 


6554 


784CIP2B_885


8277 


1197 


2983


4769


6555


784CIP2B_886


8281 


1198 


2984 


4770 


6556 


784CIP2B_887 


8283 


1199 


2985


4771


6557


784CIP2B_888


8289 


1200 


2986


4772


6558


784CIP2B_889 


8295 


1201 


2987 


4773 


6559


784CIP2B_890


8300 


1202 


2988


4774


6560


784CIP2B_891


8303 


1203


2989


4775


6561


784CIP2B_892 


8304 


1204 


2990


4776


6562


784CIP2B_893


8305


1205 


2991


4777


6563


784CIP2B_894


8309 


1206 


2992


4778


6564 


784CIP2B_895 


8316 


1207 


2993


4779


6565


784CIP2B_896


8319 


1208 


2994 


4780 


6566


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996


4782 


6568 


784CIP2B_899 


8323 


1211 


2997 


4783 


6569 


784CIP2B_900


8325


1212


2998


4784 


6570 


784CIP2B_901 


8331 


1213 


2999


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903


8333 


1215 


3001


4787 


6573 


784CIP2B_904 


8335 


1216 


3002


4788


6574


784CIP2B_905


8336 


1217 


3003


4789


6575


784CIP2B_906


8337 


1218 


3004 


4790 


6576


784CIP2B_907 


8340 


1219 


3005 


4791 


6577 


784CIP2B_908


8343 


1220 


3006


4792


6578


784CIP2B_909 


8347 


1221 


3007 


4793 


6579


784CIP2B_910 


834S 


1222 


3008


4794


6580


784CIP2B_911 


8351 


1223 


3009 


4795 


6581


784CIP2B_912


8353 


1224 


3010 


4796 


6582 


784CIP2B_913


8355 


1225 


3011


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365 


1227 


3013


4799


6585


784CIP2B_916


8367 


1228 


3014 


4800 


6586 


784CIP2B_917 


8369 


1229 


3015 


4801 


6587 


784CIP2B_919


8375 


1230 


3016


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589


784CIP2B_921 


8391 


1232 


3018


4804 


6590 


784CIP2B_922 


8393 


1233 


3019


4805


6591


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924


8394 


1235 


3021


4807


6593


784CIP2B_925


8395 


1236 


3022


4808


6594


784CIP2B_926


8396 


1237 


3023 


4809 


6595 


784CIP2B_927


8398 




1238


3024


4810


6596


784CIP2B_928


8401


1239 


3025


4811


6597


784CIP2B_929


8402


1240


3026


4812


6598


784CIP2B_930


8405


1241 


3027 


4813 


6599 


784CIP2B_931


8406 


1242 


3028 


4814 


6600


784CIP2B_932


8409


1243 


3029 


4815 


6601


784CIP2B_933


8410


1244 


3030 


4816 


6602 


784CIP2B_934


8414 


1245 


3031 


4817


6603


784CIP2B_935


8415


1246 


3032 


4818 


6604 


7R4CTP2B 936 


8419 


1247 


3033 


4 819 


cent 




84 2£ 


1246 


3034 


482C 


660b 


7R4CTP2B 93R 


84 3 C 


1245* 


3035 


4821 


6607 




R4 ' 


1250 


3036 


4 822 


DO VJ O 


/ O 4 * V-i-Jr 4JG__ U 


64^2 


12 51 


303 7 


a no 

S J 


DO 


/ o ^* 1 r £ JD_ 2> H J. 


04;-, 


1252 


3038 


£ R9/I 


Ob I U 


1 0**y — tr 40 3« <£ 


84 3 4 


12 5^ 


303 S 


A PO K 


bo A J 


f \5H\~Xir C7SJ 


(til "3 £ 


12 54 


3 040 




bOJL,^ 




C yl •} C 


12 55 


3041 


4 827 


66 1 j 




844- 


56 


3042 


4 828 


66 1 4 


/ o « L-Xi*^ b_ i» <i b 


84 5C 


12 57 


3043 


4 82 9 


6615 




QA Cl 1 

us 


1258 


304 4 


4 83 0 


66 16 


T n i /^T O O ji q 

/onLlr^iJ i»9 o 


64 52 


1259 


3 045 


4 83 1 


6617 


/84CIP2B_94i7 


84 60 


1 2 6 C 


3 04 6 


4 832 


661E 


784CIP2D_y5U 


84 6 1 


1261 


304 7 


4833 


6619 


784CIP2B 951 


84 62 




304 6 


4 834 


6620 


/04CIP2B_952 


6464 


1263 


3049 


4 835 


6621 


784CIP2B_953 


8465 


12 64 


3 050 


4 836 


6622 


784CIF2B_954 


84 6 7 


126 5 


3051 


4 837 


6623 


784CIP2B_955 


84 70 


1266 


3052 


4 838 


6624 


784CIP2B_956 


8473 


1267 


3053 


4 839 


6625 


784C1F2U_95 / 


84 73 




3054 


4840 


6626 


784CIP2B^95o 


8474 


1269 


3 055 


4841 


6627 


Ifl >l r*TTT^O DC ft 

/84ClP2B___9b9 


6475 


i o*7n 
/ u 


*i n t c 

JUDO 


4 842 


6628 


/b^ci^o you 


8476 


1271 


3057 


4 843 


662S 


/oSLlr^D ?bl 


8 4 8 0 ' 






4844 


6530 


7o4CIJP2B__yoZ 


848/ 


ion 


3059 


4 845 


6531 


/os LJLP^o 3bJ 




J.Z / 'i 


JudU 


4 846 




Iohk.L)?Zd Sbs 


ft A Q t. 


1275 


3061 








o/nr. 


1276 


3062 


/ flip 


bo JH 


/OHLlrZO 7ob 




1277 


3063 




00 


/ O *i V- J. tr&D__y O / 


84 94 


1276 


3064 


4 850 


6636 




8496 


1275* 


3065 


4851 


6637 


*7fliir , TDOn Q£Q 
/ O *i L X r^tS ?D7 


8497 


1280 • 


3066 


4852 


6636 


/Oil,! tr AD s 1 v 


84 99 


1281 


3067 


4 853 


6639 




8513 


1282 


3068 


4 854 


6640 


7 D 4f , TI?9H 972 


8522 


1283 


3069 


4 855" " 


6643 


7B4rii>?R 975 


8526 


1284 


3070 


4856 


6642 


7R4PTP5B 974 


8531 


1285 


3071 


4857 


6643 


7H4CIP2B 975 


8533 


1286 


3072 


4858 


6644 


784CIP2B 976 


8542 


1287 


3073 


48S9 


6645 


784CIP2B 977 


8544 


1288 


3074 


4860 


6646 


784CIP2B 978 


8565 


1289 


3075 


4861 


6647 


784CIP2B 979 


8565 


1290 


3076 


4862 


6646 


784CIP2B 980 


8572 


1292 


3077 


4863 


6649 


784CIP2B 981 


8576 


1292 


3076 


4864 


6650 


7fl40TP7B 9fl7 


8576 


1293 


3079 


4665 


6651 


784CIP2B_983 


8584 


1294 


3080 


4866 


6652 


784CIP2B_984 


8596 


1295 


3081 


4 867 


6653 


784CIP2B_985 


B602 


1296 


3082 


4868 


6654 


784CIP2B_986 


8604 


1297 


3083 


4869 


6655 


784CIP2B_987 


8605 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657 


784CIP2B 989 


8637 
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BNSDOCID: <WO 0153312A1_L> 



WO 01/5331? 



PCT/US0O/34263 



SEO ID NO: 


SEO IH 


SEQ ID NO: 


SEQ ID 


Priority 


SEQ ID 


of full- 


NO: oi 


of contio 


NO: 


docket number_ 


NO: in 


lencth 


ful}- 


nucleotide 


of contig 


correspond; nc 


U.S. S.N. 


nucleotide 


iength 


• sequence 


peptide 


SEQ ID NO: in 


09/488,725 


sequence j 
1 


peptide 
sequence 




sequence 


priority 
application 




1300  3086  4872  6658  784CIP2B_990  8640
1301  3087  4873  6659  784CIP2B_991  8643
1302  3088  4874  6660  784CIP2B_992  8645
1303  3089  4875  6661  784CIP2B_993  8650
1304  3090  4876  6662  784CIP2B_994  8651
1305  3091  4877  6663  784CIP2B_995  8654
1306  3092  4878  6664  784CIP2B_996  8655
1307  3093  4879  6665  784CIP2B_997  8657
1308  3094  4880  6666  784CIP2B_998  8665
1309  3095  4881  6667  784CIP2B_999  8668
1310  3096  4882  6668  784CIP2B_1000  8671
1311  3097  4883  6669  784CIP2B_1001  8672
1312  3098  4884  6670  784CIP2B_1002  8692
1313  3099  4885  6671  784CIP2B_1003  8706
1314  3100  4886  6672  784CIP2B_1004  8716
1315  3101  4887  6673  784CIP2B_1005  8729
1316  3102  4888  6674  784CIP2B_1006  8743
1317  3103  4889  6675  784CIP2B_1007  8764
1318  3104  4890  6676  784CIP2B_1008  8764
1319  3105  4891  6677  784CIP2B_1009  8764
1320  3106  4892  6678  784CIP2B_1010  8774
1321  3107  4893  6679  784CIP2B_1011  8782
1322  3108  4894  6680  784CIP2B_1012  8796
1323  3109  4895  6681  784CIP2B_1013  8827
1324  3110  4896  6682  784CIP2B_1014  8842
1325  3111  4897  6683  784CIP2B_1015  8842
1326  3112  4898  6684  784CIP2B_1016  8858
1327  3113  4899  6685  784CIP2B_1017  8871
1328  3114  4900  6686  784CIP2B_1018  8921
1329  3115  4901  6687  784CIP2B_1019  8927
1330  3116  4902  6688  784CIP2B_1020  8942
1331  3117  4903  6689  784CIP2B_1021  8994
1332  3118  4904  6690  784CIP2B_1022  9023
1333  3119  4905  6691  784CIP2B_1023  9028
1334  3120  4906  6692  784CIP2B_1024  9058
1335  3121  4907  6693  784CIP2B_1025  9058
1336  3122  4908  6694  784CIP2B_1026  9079
1337  3123  4909  6695  784CIP2B_1027  9079
1338  3124  4910  6696  784CIP2B_1028  9082
1339  3125  4911  6697  784CIP2B_1029  9084
1340  3126  4912  6698  784CIP2B_1030  9093
1341  3127  4913  6699  784CIP2B_1031  9101
1342  3128  4914  6700  784CIP2B_1032  9103
1343  3129  4915  6701  784CIP2B_1033  9105
1344  3130  4916  6702  784CIP2B_1034  9151
1345  3131  4917  6703  784CIP2B_1035  9161
1346  3132  4918  6704  784CIP2B_1036  9172
1347  3133  4919  6705  784CIP2B_1037  9174
1348  3134  4920  6706  784CIP2B_1038  9204
1349  3135  4921  6707  784CIP2B_1039  9234
1350  3136  4922  6708  784CIP2B_1040  9235
1351  3137  4923  6709  784CIP2B_1041  9239
1352  3138  4924  6710  784CIP2B_1042  9256
1353  3139  4925  6711  784CIP2B_1043  9276
1354  3140  4926  6712  784CIP2B_1044  9345
1355  3141  4927  6713  784CIP2B_1045  9379
1356  3142  4928  6714  784CIP2B_1046
1357  3143  4929  6715  784CIP2B_1047  9437
1358  3144  4930  6716  784CIP2B_1048  9469
1359  3145  4931  6717  784CIP2B_1049  9500
1360  3146  4932  6718  784CIP2B_1050  9502
1361  3147  4933  6719  784CIP2B_1051  9520




1362  3148  4934  6720  784CIP2B_1052  9541
1363  3149  4935  6721  784CIP2B_1053  9541
1364  3150  4936  6722  784CIP2B_1054  9548
1365  3151  4937  6723  784CIP2B_1055  9556
1366  3152  4938  6724  784CIP2B_1056  9556
1367  3153  4939  6725  784CIP2B_1057  9575
1368  3154  4940  6726  784CIP2B_1058  9589
1369  3155  4941  6727  784CIP2B_1059  9599
1370  3156  4942  6728  784CIP2B_1060  9602
1371  3157  4943  6729  784CIP2B_1061  9606
1372  3158  4944  6730  784CIP2B_1062  9622
1373  3159  4945  6731  784CIP2B_1063  9623
1374  3160  4946  6732  784CIP2B_1064  9646
1375  3161  4947  6733  784CIP2B_1065  9747
1376  3162  4948  6734  784CIP2B_1066  9773
1377  3163  4949  6735  784CIP2B_1067  9785
1378  3164  4950  6736  784CIP2B_1068  9801
1379  3165  4951  6737  784CIP2B_1069  9811
1380  3166  4952  6738  784CIP2B_1070  9843
1381  3167  4953  6739  784CIP2B_1071  9854
1382  3168  4954  6740  784CIP2B_1072  9854
1383  3169  4955  6741  784CIP2B_1073  9864
1384  3170  4956  6742  784CIP2B_1074  9864
1385  3171  4957  6743  784CIP2B_1075  9871
1386  3172  4958  6744  784CIP2B_1076  9879
1387  3173  4959  6745  784CIP2B_1077  9881
1388  3174  4960  6746  784CIP2B_1078  9885
1389  3175  4961  6747  784CIP2B_1079  9901
1390  3176  4962  6748  784CIP2B_1080  9912
1391  3177  4963  6749  784CIP2B_1081  9916
1392  3178  4964  6750  784CIP2B_1082  9921
1393  3179  4965  6751  784CIP2B_1083  9925
1394  3180  4966  6752  784CIP2B_1084  9930
1395  3181  4967  6753  784CIP2B_1085  9949
1396  3182  4968  6754  784CIP2B_1086  9951
1397  3183  4969  6755  784CIP2B_1087  9959
1398  3184  4970  6756  784CIP2B_1088  9973
1399  3185  4971  6757  784CIP2B_1089  9982
1400  3186  4972  6758  784CIP2B_1090  9994
1401  3187  4973  6759  784CIP2B_1091  10021
1402  3188  4974  6760  784CIP2B_1092  10041
1403  3189  4975  6761  784CIP2B_1094  10067
1404  3190  4976  6762  784CIP2B_1095  10073
1405  3191  4977  6763  784CIP2B_1096  10112
1406  3192  4978  6764  784CIP2B_1097  10117
1407  3193  4979  6765  784CIP2B_1098  10132
1408  3194  4980  6766  784CIP2B_1099  10165
1409  3195  4981  6767  784CIP2B_1100  10217
1410  3196  4982  6768  784CIP2B_1101  10226
1411  3197  4983  6769  784CIP2B_1102  10232
1412  3198  4984  6770  784CIP2B_1103  10237
1413  3199  4985  6771  784CIP2B_1104  10279
1414  3200  4986  6772  784CIP2C_1  33
1415  3201  4987  6773  784CIP2C_2  271
1416  3202  4988  6774  784CIP2C_3  848
1417  3203  4989  6775  784CIP2C_4  849
1418  3204  4990  6776  784CIP2C_5  864
1419  3205  4991  6777  784CIP2C_6  953
1420  3206  4992  6778  784CIP2C_7  980
1421  3207  4993  6779  784CIP2C_8  1595
1422  3208  4994  6780  784CIP2C_9  1697
1423  3209  4995  6781  784CIP2C_10  1744




1424  3210  4996  6782  784CIP2C_11  1937
1425  3211  4997  6783  784CIP2C_12  1955
1426  3212  4998  6784  784CIP2C_13  1955
1427  3213  4999  6785  784CIP2C_14  2185
1428  3214  5000  6786  784CIP2C_15  2889
1429  3215  5001  6787  784CIP2C_16  2901
1430  3216  5002  6788  784CIP2C_17  2902
1431  3217  5003  6789  784CIP2C_18  2905
1432  3218  5004  6790  784CIP2C_19  2946
1433  3219  5005  6791  784CIP2C_20  2956
1434  3220  5006  6792  784CIP2C_21  2959
1435  3221  5007  6793  784CIP2C_22  2965
1436  3222  5008  6794  784CIP2C_23  2966
1437  3223  5009  6795  784CIP2C_24  2970
1438  3224  5010  6796  784CIP2C_25  2985
1439  3225  5011  6797  784CIP2C_26  2987
1440  3226  5012  6798  784CIP2C_27  2993
1441  3227  5013  6799  784CIP2C_28  2993
1442  3228  5014  6800  784CIP2C_29  3017
1443  3229  5015  6801  784CIP2C_30  3046
1444  3230  5016  6802  784CIP2C_31  3050
1445  3231  5017  6803  784CIP2C_32  3357
1446  3232  5018  6804  784CIP2C_33  3359
1447  3233  5019  6805  784CIP2C_34  3432
1448  3234  5020  6806  784CIP2C_35  3438
1449  3235  5021  6807  784CIP2C_36  3439
1450  3236  5022  6808  784CIP2C_39  3463
1451  3237  5023  6809  784CIP2C_40  3466
1452  3238  5024  6810  784CIP2C_41  3466
1453  3239  5025  6811  784CIP2C_42  3467
1454  3240  5026  6812  784CIP2C_43  346?
1455  3241  5027  6813  784CIP2C_44  3483
1456  3242  5028  6814  784CIP2C_45  3484
1457  3243  5029  6815  784CIP2C_46  3486
1458  3244  5030  6816  784CIP2C_47  3491
1459  3245  5031  6817  784CIP2C_48  3493
1460  3246  5032  6818  784CIP2C_49  3494
1461  3247  5033  6819  784CIP2C_50  3495
1462  3248  5034  6820  784CIP2C_51  3496
1463  3249  5035  6821  784CIP2C_52  3503
1464  3250  5036  6822  784CIP2C_53  3503
1465  3251  5037  6823  784CIP2C_54  3504
1466  3252  5038  6824  784CIP2C_55  3511
1467  3253  5039  6825  784CIP2C_56  3531
1468  3254  5040  6826  784CIP2C_57  3536
1469  3255  5041  6827  784CIP2C_58  3546
1470  3256  5042  6828  784CIP2C_59  3548
1471  3257  5043  6829  784CIP2C_60  3551
1472  3258  5044  6830  784CIP2C_61  3553
1473  3259  5045  6831  784CIP2C_62  3564
1474  3260  5046  6832  784CIP2C_63  3567
1475  3261  5047  6833  784CIP2C_64  3572
1476  3262  5048  6834  784CIP2C_65  3573
1477  3263  5049  6835  784CIP2C_66  3574
1478  3264  5050  6836  784CIP2C_67  3583
1479  3265  5051  6837  784CIP2C_68  3615
1480  3266  5052  6838  784CIP2C_69  3623
1481  3267  5053  6839  784CIP2C_70  3629
1482  3268  5054  6840  784CIP2C_71  3666
1483  3269  5055  6841  784CIP2C_72  3667
1484  3270  5056  6842  784CIP2C_73  3906
1485  3271  5057  6843  784CIP2C_74  3912

1486  3272  5058  6844  784CIP2C_75  3924
1487  3273  5059  6845  784CIP2C_76  3928
1488  3274  5060  6846  784CIP2C_77  3935
1489  3275  5061  6847  784CIP2C_78  3959
1490  3276  5062  6848  784CIP2C_?  3981
1491  3277  5063  6849  784CIP2C_8?  3989
1492  3278  5064  6850  784CIP2C_8?  4295
1493  3279  5065  6851  784CIP2C_8?  4300
1494  3280  5066  6852  784CIP2C_8?  4360
1495  3281  5067  6853  784CIP2C_8?
1496  3282  5068  6854  784CIP2C_8?  ?
1497  3283  5069  6855  784CIP2C_8?  ?
1498  3284  5070  6856  784CIP2C_8?
1499  3285  5071  6857  784CIP2C_8?  4378
1500  3286  5072  6858  784CIP2C_90  4382
1501  3287  5073  6859  784CIP2C_91  4409
1502  3288  5074  6860  784CIP2C_92  4421
1503  3289  5075  6861  784CIP2C_93  4421
1504  3290  5076  6862  784CIP2C_94  4426
1505  3291  5077  6863  784CIP2C_95  4430
1506  3292  5078  6864  784CIP2C_96  4435
1507  3293  5079  6865  784CIP2C_97  4436
1508  3294  5080  6866  784CIP2C_98  4439
1509  3295  5081  6867  784CIP2C_99  4440
1510  3296  5082  6868  784CIP2C_100  4441
1511  3297  5083  6869  784CIP2C_101  4442
1512  3298  5084  6870  784CIP2C_102  4455
1513  3299  5085  6871  784CIP2C_103  4462
1514  3300  5086  6872  784CIP2C_104  4466
1515  3301  5087  6873  784CIP2C_105  4469
1516  3302  5088  6874  784CIP2C_106  4477
1517  3303  5089  6875  784CIP2C_107  4481
1518  3304  5090  6876  784CIP2C_108  4483
1519  3305  5091  6877  784CIP2C_109  4484
1520  3306  5092  6878  784CIP2C_110  4486
1521  3307  5093  6879  784CIP2C_111  4490
1522  3308  5094  6880  784CIP2C_112  4499
1523  3309  5095  6881  784CIP2C_113  4502
1524  3310  5096  6882  784CIP2C_114  4506
1525  3311  5097  6883  784CIP2C_115  450?
1526  3312  5098  6884  784CIP2C_116  4514
1527  3313  5099  6885  784CIP2C_117  ?
1528  3314  5100  6886  784CIP2C_118  ?
1529  3315  5101  6887  784CIP2C_119  ?
1530  3316  5102  6888  784CIP2C_120  4527
1531  3317  5103  6889  784CIP2C_121  4528
1532  3318  5104  6890  784CIP2C_122  4529
1533  3319  5105  6891  784CIP2C_123  4532
1534  3320  5106  6892  784CIP2C_124  4537
1535  3321  5107  6893  784CIP2C_125  453?
1536  3322  5108  6894  784CIP2C_126  4551
1537  3323  5109  6895  784CIP2C_127  4552
1538  3324  5110  6896  784CIP2C_128  4559
1539  3325  5111  6897  784CIP2C_129  4567
1540  3326  5112  6898  784CIP2C_13?  4568
1541  3327  5113  6899  784CIP2C_132  4585
1542  3328  5114  6900  784CIP2C_133  4592
1543  3329  5115  6901  784CIP2C_134  4609
1544  3330  5116  6902  784CIP2C_135  4616
1545  3331  5117  6903  784CIP2C_136  4617
1546  3332  5118  6904  784CIP2C_137  4618
1547  3333  5119  6905  784CIP2C_138  4620




1548  3334  5120  6906  784CIP2C_139  4624
1549  3335  5121  6907  784CIP2C_140  463?
1550  3336  5122  6908  784CIP2C_141  4634
1551  3337  5123  6909  784CIP2C_142  463?
1552  3338  5124  6910  784CIP2C_143  4635
1553  3339  5125  6911  784CIP2C_144  4642
1554  3340  5126  6912  784CIP2C_145  4644
1555  3341  5127  6913  784CIP2C_146  4655
1556  3342  5128  6914  784CIP2C_147  466?
1557  3343  5129  6915  784CIP2C_148  4677
1558  3344  5130  6916  784CIP2C_149  4677
1559  3345  5131  6917  784CIP2C_150  4677
1560  3346  5132  6918  784CIP2C_152  4682
1561  3347  5133  6919  784CIP2C_153  4690
1562  3348  5134  6920  784CIP2C_154  4692
1563  3349  5135  6921  784CIP2C_155  4727
1564  3350  5136  6922  784CIP2C_156  4730
1565  3351  5137  6923  784CIP2C_157  4734
1566  3352  5138  6924  784CIP2C_158  4757
1567  3353  5139  6925  784CIP2C_159  4764
1568  3354  5140  6926  784CIP2C_160  4786
1569  3355  5141  6927  784CIP2C_161  4792
1570  3356  5142  6928  784CIP2C_162  482?
1571  3357  5143  6929  784CIP2C_163  4826
1572  3358  5144  6930  784CIP2C_164  4850
1573  3359  5145  6931  784CIP2C_165  4853
1574  3360  5146  6932  784CIP2C_166  485?
1575  3361  5147  6933  784CIP2C_167  4856
1576  3362  5148  6934  784CIP2C_168  4867
1577  3363  5149  6935  784CIP2C_169  4869
1578  3364  5150  6936  784CIP2C_170  487?
1579  3365  5151  6937  784CIP2C_171  4880
1580  3366  5152  6938  784CIP2C_172  4942
1581  3367  5153  6939  784CIP2C_173  4945
1582  3368  5154  6940  784CIP2C_174  4950
1583  3369  5155  6941  784CIP2C_175  4952
1584  3370  5156  6942  784CIP2C_176  4954
1585  3371  5157  6943  784CIP2C_177  4958
1586  3372  5158  6944  784CIP2C_178  4961
1587  3373  5159  6945  784CIP2C_179  5590
1588  3374  5160  6946  784CIP2C_180  5599
1589  3375  5161  6947  784CIP2C_181  5692
1590  3376  5162  6948  784CIP2C_182  5732
1591  3377  5163  6949  784CIP2C_183  5765
1592  3378  5164  6950  784CIP2C_184  5772
1593  3379  5165  6951  784CIP2C_185  5774
1594  3380  5166  6952  784CIP2C_186  579?
1595  3381  5167  6953  784CIP2C_187  5806
1596  3382  5168  6954  784CIP2C_188  5852
1597  3383  5169  6955  784CIP2C_189  5852
1598  3384  5170  6956  784CIP2C_190  6057
1599  3385  5171  6957  784CIP2C_191  6062
1600  3386  5172  6958  784CIP2C_192  6105
1601  3387  5173  6959  784CIP2C_193  6160
1602  3388  5174  6960  784CIP2C_194  6297
1603  3389  5175  6961  784CIP2C_195  6396
1604  3390  5176  6962  784CIP2C_196  6398
1605  3391  5177  6963  784CIP2C_197  6415
1606  3392  5178  6964  784CIP2C_198  6446
1607  3393  5179  6965  784CIP2C_199  6465
1608  3394  5180  6966  784CIP2C_200  6476
1609  3395  5181  6967  784CIP2C_201  6561


1610  3396  5182  6968  784CIP2C_202  6574
1611  3397  5183  6969  784CIP2C_203  6576
1612  3398  5184  6970  784CIP2C_204
1613  3399  5185  6971  784CIP2C_205  ?
1614  3400  5186  6972  784CIP2C_206
1615  3401  5187  6973  784CIP2C_207  ?
1616  3402  5188  6974  784CIP2C_208  ?
1617  3403  5189  6975  784CIP2C_209  6896
1618  3404  5190  6976  784CIP2C_210  6938
1619  3405  5191  6977  784CIP2C_211  6943
1620  3406  5192  6978  784CIP2C_212  7110
1621  3407  5193  6979  784CIP2C_213  7200
1622  3408  5194  6980  784CIP2C_214  7212
1623  3409  5195  6981  784CIP2C_215  721?
1624  3410  5196  6982  784CIP2C_216  7249
1625  3411  5197  6983  784CIP2C_217  7500
1626  3412  5198  6984  784CIP2C_218  7509
1627  3413  5199  6985  784CIP2C_219  7523
1628  3414  5200  6986  784CIP2C_220  7544
1629  3415  5201  6987  784CIP2C_221  7564
1630  3416  5202  6988  784CIP2C_222  7568
1631  3417  5203  6989  784CIP2C_223  7631
1632  3418  5204  6990  784CIP2C_224  7813
1633  3419  5205  6991  784CIP2C_225  7831
1634  3420  5206  6992  784CIP2C_226  7843
1635  3421  5207  6993  784CIP2C_227  7907
1636  3422  5208  6994  784CIP2C_228  7943
1637  3423  5209  6995  784CIP2C_229  8175
1638  3424  5210  6996  784CIP2C_230  8216
1639  3425  5211  6997  784CIP2C_231  8225
1640  3426  5212  6998  784CIP2C_232  8271
1641  3427  5213  6999  784CIP2C_233  8397
1642  3428  5214  7000  784CIP2C_234  8466
1643  3429  5215  7001  784CIP2C_235  8503
1644  3430  5216  7002  784CIP2C_236  8953
1645  3431  5217  7003  784CIP2C_237  9106
1646  3432  5218  7004  784CIP2C_238  9139
1647  3433  5219  7005  784CIP2C_239  9555
1648  3434  5220  7006  784CIP2C_240  9650
1649  3435  5221  7007  784CIP2C_241  9889
1650  3436  5222  7008  784CIP2C_242  9933
1651  3437  5223  7009  784CIP2C_243  9953
1652  3438  5224  7010  784CIP2C_244  9?81
1653  3439  5225  7011  784CIP2D_1  746
1654  3440  5226  7012  784CIP2D_2  3558
1655  3441  5227  7013  784CIP2D_3
1656  3442  5228  7014  784CIP2D_4  ?
1657  3443  5229  7015  784CIP2D_5  ?
1658  3444  5230  7016  784CIP2D_6  ?
1659  3445  5231  7017  784CIP2D_7
1660  3446  5232  7018  784CIP2D_8  ?
1661  3447  5233  7019  784CIP2D_9  4703
1662  3448  5234  7020  784CIP2D_10
1663  3449  5235  7021  784CIP2D_11
1664  3450  5236  7022  784CIP2D_12
1665  3451  5237  7023  784CIP2D_13  5159
1666  3452  5238  7024  784CIP2D_14  7443
1667  3453  5239  7025  784CIP2D_15  8673
1668  3454  5240  7026  784CIP2D_16  8679
1669  3455  5241  7027  784CIP2D_17  8727
1670  3456  5242  7028  784CIP2D_18  8734
1671  3457  5243  7029  784CIP2D_19  8756




1672  3458  5244  7030  784CIP2D_20  8818
1673  3459  5245  7031  784CIP2D_21  8844
1674  3460  5246  7032  784CIP2D_22  884?
1675  3461  5247  7033  784CIP2D_23  8912
1676  3462  5248  7034  784CIP2D_24  891?
1677  3463  5249  7035  784CIP2D_25  8918
1678  3464  5250  7036  784CIP2D_26  8941
1679  3465  5251  7037  784CIP2D_27  8941
1680  3466  5252  7038  784CIP2D_28  8951
1681  3467  5253  7039  784CIP2D_29  8951
1682  3468  5254  7040  784CIP2D_30  9007
1683  3469  5255  7041  784CIP2D_31  9012
1684  3470  5256  7042  784CIP2D_32  9013
1685  3471  5257  7043  784CIP2D_33  9025
1686  3472  5258  7044  784CIP2D_34  9053
1687  3473  5259  7045  784CIP2D_35  9054
1688  3474  5260  7046  784CIP2D_36  9054
1689  3475  5261  7047  784CIP2D_37  9112
1690  3476  5262  7048  784CIP2D_38  9134
1691  3477  5263  7049  784CIP2D_39  9152
1692  3478  5264  7050  784CIP2D_40  9152
1693  3479  5265  7051  784CIP2D_41  9211
1694  3480  5266  7052  784CIP2D_42  9223
1695  3481  5267  7053  784CIP2D_43  9223
1696  3482  5268  7054  784CIP2D_44  9231
1697  3483  5269  7055  784CIP2D_45  9236
1698  3484  5270  7056  784CIP2D_46  9236
1699  3485  5271  7057  784CIP2D_47  9303
1700  3486  5272  7058  784CIP2D_48  9309
1701  3487  5273  7059  784CIP2D_49  9314
1702  3488  5274  7060  784CIP2D_50  9326
1703  3489  5275  7061  784CIP2D_51  9339
1704  3490  5276  7062  784CIP2D_52  9346
1705  3491  5277  7063  784CIP2D_53  9376
1706  3492  5278  7064  784CIP2D_54  9382
1707  3493  5279  7065  784CIP2D_55  9407
1708  3494  5280  7066  784CIP2D_56  9414
1709  3495  5281  7067  784CIP2D_57  9439
1710  3496  5282  7068  784CIP2D_58  94?5
1711  3497  5283  7069  784CIP2D_59  94?2
1712  3498  5284  7070  784CIP2D_60  9501
1713  3499  5285  7071  784CIP2D_61  9526
1714  3500  5286  7072  784CIP2D_62  9526
1715  3501  5287  7073  784CIP2D_63  9551
1716  3502  5288  7074  784CIP2D_64  9557
1717  3503  5289  7075  784CIP2D_65  9566
1718  3504  5290  7076  784CIP2D_66  9566
1719  3505  5291  7077  784CIP2D_67  95?7
1720  3506  5292  7078  784CIP2D_68  9615
1721  3507  5293  7079  784CIP2D_69  9626
1722  3508  5294  7080  784CIP2D_70  9649
1723  3509  5295  7081  784CIP2D_71  9652
1724  3510  5296  7082  784CIP2D_72  9660
1725  3511  5297  7083  784CIP2D_73  9662
1726  3512  5298  7084  784CIP2D_74  9725
1727  3513  5299  7085  784CIP2D_75  9746
1728  3514  5300  7086  784CIP2D_76  9777
1729  3515  5301  7087  784CIP2D_77  9787
1730  3516  5302  7088  784CIP2D_78  9790
1731  3517  5303  7089  784CIP2D_79  9842
1732  3518  5304  7090  784CIP2D_80  9842
1733  3519  5305  7091  784CIP2D_81  9846




1734  3520  5306  7092  784CIP2D_82  9867
1735  3521  5307  7093  784CIP2D_83  10010
1736  3522  5308  7094  784CIP2D_84  10011
1737  3523  5309  7095  784CIP2D_85  10052
1738  3524  5310  7096  784CIP2D_86  10057
1739  3525  5311  7097  784CIP2D_87  10085
1740  3526  5312  7098  784CIP2D_89  10135
1741  3527  5313  7099  784CIP2D_90  10142
1742  3528  5314  7100  784CIP2D_92  1016^
1743  3529  5315  7101  784CIP2D_92  10173
1744  3530  5316  7102  784CIP2D_94  10173
1745  3531  5317  7103  784CIP2D_9b  10273
1746  3532  5318  7104  784CIP2E_1   3121
1747  3533  5319  7105  784CIP2E_2   3628
1748  3534  5320  7106  784CIP2E_4   3673
1749  3535  5321  7107  784CIP2E_5   4018
1750  3536  5322  7108  784CIP2E_6   4467
1751  3537  5323  7109  784CIP2E_7   4865
1752  3538  5324  7110  784CIP2E_8   4916
1753  3539  5325  7111  784CIP2E_9   4923
1754  3540  5326  7112  784CIP2E_10  4926
1755  3541  5327  7113  784CIP2E_11  4962
1756  3542  5328  7114  784CIP2E_12  4963
1757  3543  5329  7115  784CIP2E_13  4964
1758  3544  5330  7116  784CIP2E_14  4988
1759  3545  5331  7117  784CIP2E_15  5835
1760  3546  5332  7118  784CIP2E_16  7682
1761  3547  5333  7119  784CIP2E_17  7682
1762  3548  5334  7120  784CIP2E_18  7699
1763  3549  5335  7121  784CIP2E_19  7707
1764  3550  5336  7122  784CIP2E_20  7707
1765  3551  5337  7123  784CIP2E_21  7752
1766  3552  5338  7124  784CIP2E_22  8357
1767  3553  5339  7125  784CIP2E_23  9065
1768  3554  5340  7126  784CIP2E_24  9324
1769  3555  5341  7127  784CIP2F_1   2976
1770  3556  5342  7128  784CIP2F_2   3559
1771  3557  5343  7129  784CIP2F_3   4021
1772  3558  5344  7130  784CIP2F_4   4474
1773  3559  5345  7131  784CIP2F_5   4566
1774  3560  5346  7132  784CIP2F_6   4705
1775  3561  5347  7133  784CIP2F_7   4707
1776  3562  5348  7134  784CIP2F_8   4712
1777  3563  5349  7135  784CIP2F_9   5008
1778  3564  5350  7136  784CIP2F_10  5009
1779  3565  5351  7137  784CIP2F_11  5015
1780  3566  5352  7138  784CIP2F_12  5015
1781  3567  5353  7139  784CIP2F_13  7724
1782  3568  5354  7140  784CIP2F_14  7725
1783  3569  5355  7141  784CIP2F_15  8828
1784  3570  5356  7142  784CIP2F_16  8830
1785  3571  5357  7143  784CIP2F_17  9739
1786  3572  5358  7144  784CIP2F_18  9896
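
The amino acid segments listed in Table 7 are written with the one-letter residue codes given in that table's legend, plus four non-residue symbols (X for unknown, * for a stop codon, / for a possible nucleotide deletion, and \ for a possible nucleotide insertion). As a minimal sketch of how that notation could be read programmatically, the following Python tallies each kind of symbol in a segment. The function name `summarize_segment` and the result keys are hypothetical and are introduced here for illustration only; any character not in the legend (for example a digit left by text extraction) is counted as unrecognized rather than interpreted.

```python
# One-letter residue codes as listed in the Table 7 legend.
RESIDUE_NAMES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue symbols from the same legend, mapped to result keys.
# The key names are illustrative assumptions, not part of the source.
MARKER_KEYS = {
    "X": "unknown",     # X = Unknown
    "*": "stop",        # * = Stop Codon
    "/": "deletion",    # / = possible nucleotide deletion
    "\\": "insertion",  # \ = possible nucleotide insertion
}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend markers in one amino acid segment.

    Whitespace is ignored and letters are compared case-insensitively.
    Characters that the legend does not define are tallied under
    'unrecognized' instead of being guessed at.
    """
    counts = {"residues": 0, "unknown": 0, "stop": 0,
              "deletion": 0, "insertion": 0, "unrecognized": 0}
    for ch in segment:
        if ch.isspace():
            continue
        symbol = ch.upper()
        if symbol in RESIDUE_NAMES:
            counts["residues"] += 1
        elif symbol in MARKER_KEYS:
            counts[MARKER_KEYS[symbol]] += 1
        else:
            counts["unrecognized"] += 1
    return counts
```

For example, `summarize_segment("MKT*A/G\\X")` counts five residues and one each of the stop, deletion, insertion, and unknown symbols. A nonzero `unrecognized` count is a hint that a line contains characters outside the legend.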



TABLE 7


SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5359


337 


1131 


AHLSARLSALI LDEVA3 LPAPQNLSVLSTNMKJ-ILLMWS PVI AFG 
ETVYYSVEYQGEYESLYTSHIKIPSSWCSLTEGPSCDVTDDITA 
TVP YNLR VRATLGSQTS / CLEHP /VS I PLI ETQ PSLPDL/RMEI 
TKDGFHLVIELEDLGFQFEFLVAYWRRSPGAEEHVKMVRSGG1F 
VHLETMSPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL 
VLALFAFVGFML I LVW P L FVW KMGRLLQ/ YLLL PRGG S SQTPW 
KITQF 


5360


2 


Ills 


PRVRSSGGQEDFASQQKARPRFTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPED 
SGKYTCRVSNRAGAINATYKVDVlORTRSKPVjbTGTKPVNTTVr 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
WLPTGDVWSRPDGSYLNKLL1TRAR0DDAGMYICLGANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCOAO KKPCT F APAP PLPGHR? PGTARDRSGDKDLPSL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361


3 


925 


HEGS1SSANILLDDQFOPKLTDFAMAHFRSHLEHOSCTINMTSS 
SSKKLWYMPEEY I ROGKLS I XTDVYS FG IV I ME VLTGCRWLDD 
PKH1QLRDLLRE1.MEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAA7RAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC 
PSFLFLENVPS 3 P VEDDES QNNNLLPS DEGLR 1 DRMTQKT P r EC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362


2 


4875 


SCOVEGCTRTYMSSOSIGKHMKTAHPDQYAAFKMQRKSKKGOKA 
NNLK7PNNGKF VY FLPS PWS SNPFFTSQTKANGNPACS AQLQH 
VSPPIFPAHLASVSTPLLSSMESVINPNITS0DKN2QGGMLCS0 
MENLPSTALPAQKEDLTKTVLPLNIDRGSDPFLSLPAESSS I DL 
FPSPADSGTNSVFSQLENNTNHYSSQIEGNTNSSFLKGGNGENA 
VFPS Q VNVANK F S STN A00S A P EKVKKDRGRGQTG KER KP KHN K 
RAKK PA 1 1 RBGK F I CS R CYRA FTN PRS LGGHLS KR S YCKPLDGA 
E I AOE LLQSNGQP S LLASM I LS TNAVNLQOPQQ S TFN PEACFKD 
PSFL0LLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSE1I3OAL 
ETAG1 PS TFEGAEMLSHVSTGCVSDASOVNATVMPNPTVPPLLH 
TVCH PNTLLTNQNRTS NS KTS S I EECS S LP VFPTNDLLLKT VEN 
GLCSS SF2NSGG PSQN FTSNS SRVSV2 SGPQNTKSSHLNKKGWS 
ASKRRKKVAPPL1 APNASONLVTSDLTTMGLI AKSVEI PTTNLH 
SNVIPTCEPQSLVENLTOKLNNVNNOLFMTDVKENFKTSLESHT 
VLAFLTLKTENGDSQMMALNSCTTSVNSDLQ1SEDNVIQNFEKT 
LEII KTAMNSOI LEVXSGSOGAGETSQNAQINYNI QLPSVNTVO 
NNKLPDSSP\FSSFISVMPTESNIPOSE\VSHKEDQIOEILEGL 
OKLKLEKDLSTPASOC VLI NTS VTLTP TP VKSTADI TV! OPVSE 
MIN10FNDKVNK P FVCQNOGCN YSAMTKDALFKHYGKI HQYTPE 
MILE I KKNQLKFAPF KC WPTCTKTFTRNSNLRAHCQLVHHFTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEEI^KKESQ 
PALELRAETQNTHSNVAVI PEKQLI EKKS PDKTESSLQ VITVTS 
EQCNTNALTNTOTKGRKIRRHKKEKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIOONLILHYQAVHKSDLPAFSAEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSRIFQAITGLIQHYMKL 
HEMTP EE I ES MT A S V DVG K FP CDQ LECKS S FTTY LNYWHLEAD 
HGIGLRASKTEEDGVYXCDCEG CDRI YATRS NLLRHI FWKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKR KKKNNLEN KN AKI VQ I EENKPYS LKRG KHV YS I KAPJv 
DALSECTSRFV10YPCKI KGCTSWTSESNI IRKYKCHKLSKAF 
TSQHRNLLIVFKRCCNSOVKETSEOEGAKNDVKDSDTCVSESND 
NSRTTATVSQKEVEKNE 1, DEMDELTELFITKLINEDSTSVETOA 
NTSS NVS NDFQEDNL COS ERQKASNLKR VNKEKNVS QN KKRKVE 
KAEPASAAELSSVRKEEETAVAiQTIEEHPASFDKSSFKPMGFE 
VSFLKFLEESAVKOKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVOFANPSQLQCSDNVKIVLDKNLKDCTEL 
VLKQLQEMKPTVSLKKLEVHSNDFDMSVMKDisiGKATGRGCY 


5363 


8066 


702


RLCCTC5GGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PP S W RRQ P P GG I R R D FSRRLRR E AN LVATCLPVRAS L PHR LNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKROAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSM2 WDCTC1 GAGRGK J S 
CTIANRCHEGGOEYKIGDTW^RPHETGGYKLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWM?'IVDCTCLGEGSGR 
ITCTSRNRCNDODTRTSYRIGDTWSKKDNRGNIjIjQCICTGNGRG 

ewkc5rhtsvqttssgsgpftdvraavyqp0phpqpppyghcvt 
dsgwys vgmola* ktqgnkqml \ctclgng vscqetavtqtyg 
gnsngepcvlpftyngrtfyscttegrqdghlwcsttsnyecdq 
kys fctdhtvlvqtrggnsngalchfpflynnhnytdcts egrr 
dnmkw cgttqn ydadqkfg f c pkaahee i cttn eg vmyr i gdqw 
dkqkdmghmmrctcvgngrgewtc i aysqlrdqci vdd i tynvn 
dtfhkrheeghmlnctcfgqgrgrwkcdpvdqcqdsetgtfyoi 
gdsw e ky vhg vryocy cygrg2gevjhcqp lqty psssg pvevf3 
tetpsqpnshpiqwnapqpshiskyilrwrpknsvgrwkeatip 
ghlns yt1 kglkpg wyegcl3 £ 1qqyghqevtrfdftttstst 
pvtsnt\vtgettpfsplvatsesvteitassfwsvjvsasetv 
sgfrveyelseegdepqylvlpstatsvXnipXdllpgrkyivn 
vyqi sedgeqsh lstsqttappappdptvdqvddts ivvrwsr 
pqapitgyrivyspsvegsstel.nlpetansvtlsdlopgvqyn 
itiyaveenoestpw1qoettgtprsdtvpsprdlqfvevtdv 
kvtimwtppesavtgyrvdvip\ r nlpgehgorlplsrntf\aek 
tglspgvtyyfkvfavshgreskpltaoottkl\daptnlofvn 
etdstvlvrwtppraqitgykltvgltrrgqprqynvgpsvsky 
plrwlqpas e ytvslval kgnqe s pkatgvfttlqpgssi ppyn 
tevtett 1 v i twtpapr i gfklg vrpsqggeaprevtsdsgs 3 v 
vsgltpgve yvyti 0vlrd30erdap \ i vnk\wtplspptnlh 
leanpdtgvl'tvs w ers ttpd2 tg yr i tttptngqqgnsleevv 

HADQS SCTF \ DNLEVPGL2YNVS VYTVKDDKES VPI SDTI I PAV 
PPPTDLRFTN/ 1 LGPDTMRVTW\APPPS I DLTNFLVFYS PVKNE 
GRMLOSLS 1 FFLSDN\AWLTNLLPGTEYWS VSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\D1TA\NSFT\VHW\IAPRA/TPI 
TGYRIR\KHFEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
SIVALNGREES PLLIGQQSTVSDVPRDLEVVAATPTSLLI \SWD 
APAVTVRYYR ITYG ETGGNS FVC E FTVPGSKS TAT I SGLKPGVD 
YTITVYAVTGRGDSPAS SKP1S1 NYRTEIDKPSQMQVTDVQDNS 
I S VKWLP S S S PVTGYRVTTT \ P KNG PG \PTKTKTAG PDQTEMT I 
EGLQ PTVE YW3 VY AQN PSGESQ P LVQTAVTMI DRPKGLAFTDV 
DVDS I KI AWES PQGQVSRYRVTY SSPEDGIHELFPAPDGEEDTA 
ELOGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAOWTPFNVQLTGYRWVTPKEKTGPMKE1NLAPDSS 
SVWSGLMVATKYEVSVYAl.KDTLTSRPAQGWTTLENVSPFnU 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 
DVRSYTI TG LQ PGTDYKI YLYTLNDNARSSP WI DASTAI DAPS 
NLRFLATTPNSLbVSWOPPRARlTGYIIKYEKPGSPPREVVPRP 
RPGVTEATITGbEPGTEYTlYVIALKNNQKSEPLIGRKKTDELP 
OLVTLPHPNLHGPE3LDVPSTVQKTPFVTHPGYDTGNGIOLPGT 
SGQQPS VGQOM I FEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFODTSEYIISCHPVGTDEEPLOFRVPGTSTSAT 
LTGLTRGATYNI I VEAiXDQORKKVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERWSESGFKLLCQCIiGFGSGHFRCD 
SSRWCHDNGWYKIGEKWDROGENGQMMSCTCliGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYIjGAICSCTCFGGQRGWRCDNCR 
RPGGE PS PEGTTGOS YNOYSQR YHQRTNTNVNCP I BCFM PLDVQ 
ADREDSRE 


5364 


8066 


703 


RIiCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREAmjVATCIiPVRASLPHRLNMli 
RGPG PGLliLlAV LC LGT AVPSTG ASKS KRQAQQMVQPQS P V A VS 
OS K PGC Y DWG KH YQI NQQWER T Y1/3NAL VCTC YGGSRG FN CES K / 

P EAE ETC FD K Y TGNTY R VGDTY ER PKDSM I WDCTC I G AG RGR I S 

CT I AMR CH EGG OS YK 1 GDTWRR PHETGGY MLECV CLGN G KGFJKT 

CKPIAEKCFDILAAGTSYVVGETWEKPYQGWMMVDCTCLGEGSGR 

ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLbOCICTG^GRG 

EWKCERHTSVOTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 

DSG WYS VG MO LA * KTQGN KQML\ CTCLGNG VS CQET AVTQTYG 

GN S NGE P CVL P FT YNG RT? YS CTTEG RQDGHLWCSTT SNYEQEO 

KYSFCTDKTVLVOTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

DNMKWCGTT0NYDADQKFGFCPMAAHEE1CTTNEGVMYRIGD0W 

DKOHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVK 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDOCQDSETGTFYOI 

GDSWSKYVHGVRY0CYCYGRGIGEWHCQPLQTYPSSSGPVEVF3 

TETPS0PNSHPI0WNAPQPSHISKY1LRWRPKNSVGRWKEATIP 

GKLNSYTI KGLK PG WY EGQLISI QQ YGHQEVTR FDFTTTS TST 

PVTSNTWTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYEL S EEGDEPQYL VLPS7ATS V\NIP\DLLPGR KY I VK 

VYQIS EDGEOSLI LSTSQTTAPDAPPDPTVDOVDDTS I WRWSR 

P0APITGYR1VYSPSVEGSSTELNLPETANSV7LSDLQPGVQYK 

IT 1 Y AVEENQESTPW 1 QQETTGTPRSDTVPS PRDLQFVEVTDV 

KVTI MWTF PESAVTGYRVDVI PVNLPGEHGQRLPLSRNTF\ AEK 

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTP PRAQ I TGYRLTVGLTRRGQPRQYNVGP SVS KY 

PLRNLQPASEYTVSLVAI KGNQESPKATGVFTTLQPGSS IPPYK 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLTPGVEYVYTIQVXRDGQERDAP\IVNK\VVTPLSPPTNI,K 

LEAN PDTGVLTVSWERSTTPDITGYR ITTTPTNGQOGNS LEE W 

hadqssctfNdklevpgleynvsvytvkddkesvp I SDT1 1 PAV 
pfptdlrftn/ilgpdtmrvtw\apppsidltnflvryspvkne 
grmlosls i ffls dn \awltnllpgteyvvsvssvyeqhestp 
\ lrgrqktglds p\tgi dfs \ dita\ ws ft\vhw\ i apra/ tpi 
tgyrir\hhpehf\sgrpredr\vphsrnsitltnltpgteyw 
sivalngreesplligoostvsdvprdlewaatptslliXswd 
apavtvryyr1tygetggnspvqeftvpgskstatisglkpgvd 
ytj tv yavtgr gds pass kpi sinyrte i dkpsqmqvtdvqdns 
isvkwl?ssspvtgyrvttt\pkngpg\ptktktagpdqtemti 
eglo p tv e yvvs v y aqn p sge sqplvqta vtn i dr p kgl aftd v 
dvds 1 k i awes fqgqvsryrvtyssp edgi helfpapdgeedta 
eloglrpgseytvs walhddmesqpli gtqstai paptdlkft 
qvtptslsaowtppnvoltgyrvrvtpkektgpmkeinlapdss 
s vwsglmvatk y e vs vyalkdtlts rpaqgwttlenv s p prr 
arvtdatettit i swrtktetitgfqvdavpangqtp iqrti kp 
dvrsytitglqpgtdykiylytlndnarsspwidastaidaps 
nlr flattpn s llvswc p praritg y 1 1 ky ekpgs ppre w pr p 
rpgvte ati tglepgteyti yvialknnoksepli gr kktdelp 
qlvtlphpnlhgpeildvpstvqktpfvthpgydtgngiqlpgt 
sgqqpsvgqomifeehgfrrttppttatpirhrprpyppnvgqe 
alsqtt i s wap fqdtsey 1 1 s ch p vgtdee pix2 fr vpgtsts at 
ltgltrgatyn 1 1 vealkdqqrhkvreewtvgnsvneglnopt 
ddscfdpytvshyavgdewermsesgfkllcqclgfgsghfrcd 
ssrwchdngvnykigekwdrqgengqmmsctclgngkgefkcdp 
heatcyddgktyhvgeqwokeylgaicsctcfggqrgwrcdncr 
rpggepspegttgqsynqysqryh0rtntmvncp1ecfmpldvc 
adredsre 


5365 


8066


703 


RLCCTGGGSGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGiRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKROAQQIWQPQSPVAVS 
QSKPGCYDNGKKYOINQQWBRTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSKIWDCTCIGAGRGR1S 
CTIANRCHEGGOSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDF^GTSYWGETWEKPYQGWIWDCTCLGEGSGR 
ITCTSKNRCNDQDTRTSYRIGDTWSKKDNRGNLL0C1CTGNGRG 
EWKCERHTSVOTTSSGSGPFTDVRAAVyQPQPHPOPPPYGHCVT 
DSGWYSVGMOLA*KTQGNKQML\CTCLGNGVSCCETAVTQTYG 
GNSKGEPCVLPFT^GRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSXGALCHFPFLYNmitYTDCTS EGRR 
DNMKWCGTT0NYDADQKFGFCPMAAHEEICTTMEGVMYR1GDQW 
DKOKDyjGKMKRCTCVGNGRGEWTClAYSQLRDOCIVDDITYNVN 

dtfhkrheegkmlnctcfgqgrgrwkcdpvdqcodsetgtpyqi 
gds w kky vhgvr y qcy c ygrg3 gewhcqplqtypss sg p vevfi 
tetpsqfnshp10wnapqpshiskyilrwrpknsvgrwkeatip 
ghlksyt3kglkpgwyeg0lisiq0ygkqevtrfdptttstst 
pvtsnt \ vtge ttf fs pl vats es vteitass f v vswvs asdtv 
sgfrveyelseegdepqylvlpstatsv\n3p\dllpgrky3v7c 
vyq3 sedge0sl3 lstsqttapdappdpt\0)qvddts3 vvrwsr 
p0apitgyrivyspsvegsstelklpetansvt1.5dl0pgvqyn 
1tiyaveen0e5tpwiqqettgtprsdtvpsprdlcfvevtdv 
kvt1 mwtfpesavtgyrvdvi pv^lpgemgqrlplsrntf\aen 
tgls f g vt y y f k v fav s hg r es kp ltaqqttkl \ daptn lq f vn 
etdstvlvrktppraoitgyrltvgltrrgqproynvgpsvsky 
plrnlopas e y tvslva3 kgmqespkatgvfttlopgss i p? yn 
tevtettiv3twtpafr1gfklgvrps0ggeaprevtsdsgsiv 
vsglt pgvey^tl ovlrdgqerdap\i vnk\wtplspptnlh 
leanpdtgvltvswersttpditgyritttptngqqgnsleew 
had0£ sctf\dnlf.vpgleynvsvytvkddkesv?3 sdti 1pav 
ppptdlrftn/:lgpdtmrvtw\apppsidltnflvryspvkne 
grmlqs ls i ffls dn\awltnllpgt2ywsvs s vyeqhestp 
\lrgrqktgldsp\tgidfs\dita\nsft\vhw\iapra/tpi 

TGYRJR\HHPEJ}F\SGRPREDR\VPHSRJJSITL , nsrLTPGTEyW 
SIVALNGREES PLLI GQQSTVSDVPRDIjEWAATPTSLLI \SWD 
APAVTVRY YR 1 T YG ETGGNS ? VQE FTV PGS KS TAT I S GLK PGVD 
YT3 TV YAVTGRGDS PASS KP I S I N YR TE I DK PSQMQVTD VQDNS 
I SVKKL? SSS PVTGYRVTTT \ PKNGPG \ PTKTKTAGPDOTEMT I 
EGLO PTVE YWS VY'AONP SGESQPLVQTAVTNIDRP KGLAFTD V 
DVDS 3KI AWES POGOVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
EL^LRPGSEYTVSVVALHDDMESOPLIGTOSTArPAPTDLKFT 
OVTPTS LSAOKTPPIJVQLTGYRVRVTPKEKTGPMKEI N1APDS S 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGVVTTLENVSPPRR 
ARVTDATETT 3 T I S WRTKTETI TGFQVDAVP ANGOTP IQRT3 KP
DVRSYT I TGLO PGTDY KI YLYTLNDNAR SSPWI DASTA3DAPS 
NbRFLA'lTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYT1YVIALKNNQKSEPL3GRKKTDELP 
QIYTLPKPXUIGPEI LDVPS7VQKTPFVTHPGYDTGNG1 QhPGT 
SG00PSVGQ0M3FEEHGFRRTTPPTTATPIRHRPRPYPPNVG0E 
ALSCTT33WAPFQDTSEYI3SCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYN3 IVEALKDQORHKVREEVVTVGNSVKEGLNQPT 
DDSCFD P YTVSH YA VGDEWERMS ESGF KLLCQCLG FG S GH PR CD 
SSRWCHDNGVNYK1GEKWCRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYKVGEQWQKEYLGAICSCTCFGGORGWRCDNCR 
RPGGFPSPEGTTGOS YNQYSQRYHQRTNTNVNCP3 ECFMPLDVQ 
ADREDSRE 


5366


8066


703


RLCCTGGGEGTPGASGKRGPAATTSLVLC1PSVPPPVPFPTLWP 
PPSWRROPPGGIRRDFSRRLRREANLVATCLPVRA.SLPHRl.NML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVOPQSPVAVS 
QSKPGCYDNGKHY01NQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMXWDCTC3GAGRGR1S 
CTIANR CHEGGOSY K I GDTWRRPHETGGYMLECVCLGNGXGEWT 
CKPIAEKCFDKAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I TCT5RJCR CNDQETR TS YRIGDTWSKKDNRGIC/LOC3 CT6NGRG 
EWKCERKTSVOTTSSGSGPFTDVRAAVYQP0PHP0PPPYGHCVT 
DSGWYSVGMQLA* KTQGNKQML\ CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYMGRTFYSCTTEGRQDGHLWCSTTSNYEQDO 
KysFCTDHTVLVOTRGGNSNGALCHFPFLyNNKNYTDCTSEGRR 
ENMKWCGTTOW YBADO KFGFCPMAAHE E 1 CTTNEGVM Y R I GDQW 

e kqhdmghkkrctcvgngrge wtci ay sqlkvqc i vdd i tynvn 
dtfhkrheegkmlkctcfgqgrgrwkcdpvdoc^dsetgtfxoi 
gdswbkwkgvryqcycygrgigewhcoplqtypsssgpvevfi 
';etpsqpkshpiqwkapopshiskyilkwrpki>isvgrwkeatip 
gh lksyt2kglkpgv\ r yegqhs i oovgk0evtrfdftttstst 
?v7skt\vtgettpfsplvatsesvteitassfwswvsasdtv 
5gfrveyelseegdepqylvlpstatsv\nip\dllpgrkyivn 
\^0i^edgeosl.ilstsqttapdappdptvdovddtsiwrw£r 
poapitgyrivyspsvegsstelnlpetansvtlsdlqpgvqyn 
z ti yaveen0estpwiqqettgtprsdtvpsprd1>qfvevtdv 

K\^IMWTPPESAVTGYRVDVIPVNLPGEHGORLPLSRNTF\a£N 

tglspgvtyyfkvfavshgreskpltaqqttkl\daptnlqfvn 
etdstvlvrwtppra01tgyrltvgltrrgqprqynvgpsvsky 
f lr nlqpas e yt vs lv a i xgnqe s p katg v ftt lqpg s s i p p yn 
tevtettivitwtpaprigfklgvrfsoggeaprevtsdsgsiv 
v5jglt?gveyvytiqvlrd6qerdap\ivnk\wtpiispptnlh 
i.ea^ifdtgvt-tvswersttpditgyritttptkgqqgnslieew 
iiadossctf\dnlevpgleynvsvytvkddkesvplsdtilpav 
f7ptdlrftn/ilgpetmrvrvj\apppsidltnflvryspvkne 
gkmloslsifflsdnXawltnllpgtkywsvssvyeohestp 
\lrgroktgldsp\tgidfs\dita\wsft\vhw\iapra/tpi 
tgyri r\hhfehf\sgrpredr\vphs rnsi tltnltpgteyw

£ I VALfvTGREESPLLIGQQSTVSDVPRELEVVAATPrSLLl \StfD 
A FAVTVRYYR3 TYGETGGNS P VOEF TV PGSKSTATI SGLKPG VD 
YT3 TVYAVTGRGDSPASSKPISINYRTS IDKPSQKQVTDVQDMS 
j$VKWLPSSSPVTGYRVTTT\PKKGPG\PTKTKTAGPDOTEMTI 
L'GL0FTVEyWSVYAQNPSGESQPl.V0TAVTN3DRPKGLAFTDV 
D\T3S1KIAWESP0GQVSRYRVTYSSPEDGIHELF?APDGEEDTA 
ELCGLRPGSEYTVSVVALHDDMESOPL J GTQSTAI PAPTDLKFT 
OVTPTSLSAOWTPPNVOLTGYRVRVTPKEKTGPMKEINLAPDSS 
S\ r \TVSGLMVA TKYEVS VYALXDTLTS R PAQGWTTLENVS PPRR 
ARVTDATSTT1TISWRTKTETITGF0VDAVPAKGQTPIQRTIKP 
DVRSYT I TGLO PGTDYK 1 YLYTLNDNAR SSP W2 DASTAI DAPS 
NLRFLATTPKSLLVSWQPPRARITGYI 1 KYSKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKT7NOKSEFL1GRKKTDELP 
pi.VTLPHPNLHGPEILDVPSTVQKTPFVrHPGYDTGNGIOLPGT 
SGOQPS VGQQNil FEEHGFRRTTPPTTATPI RHK PR P YPPNVGQE 
A LSQTT J S WAPFQDTSEY 3 1 S CHPVGTPEEPLOFRVPGTSTSAT 

rOSCFDPYTVSKYAVGDEKERMSESGFKLLCOCLGFGSGHFRCD 
SSRWCHDNGWlYKIGEKWDRQGENGOKr^SCTCLGNGKGEFKCDP 
H E ATCY DDG KTYHVGE OWOKEYLG A3 C S CTCFGGQRGWRCDNCR 
K PGGEPSPEGTTGQSY^QYSQRYHQRTKTNVNCPI ECFMPLDVQ 
ADREDSRE 


5367 


235 


3593 


KKILNMLCKKKJIVIEYLADILYEYLYGFCFSGIKKYLIIHVLRL 
3 LELKMTRLLLEKSVSLQTQYLIiLI VK7 hS WFPGKEMRHHLQIM 
EVWMRKQDS /R I VGNGSEOQLOKELADVLMDPPMDDOPGEKELV 
KKS QLDG EGDG P LSNQLS AS ST IN PV P L V G LQKP EMS LP VKPGQ 
GDSEASSPFTPVADEDSWFSK1»TYLGCASVNAPRSEVEALRMM 
S1LRS0CQISLEVTLSVFNVSEGIVRLLDP0TNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LY £ FATA FR RS AKQT P LS ATAAPQTPD SDIFTFSVSLEI KEDDG 
KG V FSAVPKDKDRQCFKLRQGIDKKI V3 Y VQOTTNKELAI ERCF 
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWNPKSPH 
FC-^^ETPKDKVLFMTTAVDLVITEVOEPVRFLLETKVRVCSP 
KE RLFWPFS KR STTEK FFLKLKQI KQRE R KNNTDTLY E WCLBS 
EE IRERRKTTASPSVRLPQSGSQSSVI PS PPSDDEEEDNDEPLL 
Sr-SGDVSKECAEKILETKGELLSKWHLNLNVRPKOLSSLVRNGV 
PEALRGEVWQLLAGCHNNDHLVEKYRILITKESPODSAITRDIN 
RTFPAHDYFKDT6GDGQDSLVK1CXAYSVYDEEIGVC0G0SF1A 
AVLLLHKPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQEYl PDhYiVHFLD I S L FJLHM YASQWFLTLFTAK F PL YM VFH 
1 1 DLLLCEG I S VI FNVALGLLKTS KDDLLI .TDFEGALK FFRVQL 
PKRYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMKEQOAQQ 
EDPIERFEREKRRLQEANMRLEQENDDLAHELVTSK3ALRKDLD 
NAEEKADALNKSLLMTKQXLIDAEEEKRRLEEESWiLKKMCRRE 
LDKAESEIKKNSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKG1SSTKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\QLVEASCK1QD\LEHPF*GLPFNE\V0;AA\K 
KTWFNRTLSSIKTATGVQGKETC 


5368 


57?. 


2014 


GAAAGAADP RRG S LGGRTMLDFAI FAVTFLLALVGAVLY LY PAS 
ROAAG I PGI TPTEEKDGNLPD1 VNSGSLHEFLVNLH ER YG P WS 
FW FG RRL WS LG TVD VL KQHIN PN KTLD /L F * NHAE V 1 1 KVSIW 
WWOCE * KP\ORKKLYENGVTDSLKSNFALLLXLPEELLDKWLSY 
FET0H\VPLS0HMLGFAMKSVTQMVMGSTFEDDOEVlRF0KNHG 
TWJSE I G KG FLDGSLDKNMTRKKQY SDALMObES VLRN H KERK 
GRNFS0KIFIDSLVQGNLNDQC1LED5M1 FSLASCI ] TAKLCTW 
AlWFLTrSEEVOKKLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTPVS AQLQDI EGKI DRF 1 1 PRETLVLYALGVVLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLS VLVKRLHLLSVEGQVI ETKYELVTS S REEAW 1 TVS KRY 


5369 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTHSASFVPNGASLED 
CKCNLFCLADLTGIKWKKYVWQGPTSAP1LFPVTEEDP3LSSFS 
R CL KAD VLG / VWR RDQRPERR E \ L * I FWGGEDP \ VL LT J »FTMT Y 
OKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRUFVR I GKWF 
VKPYEKDEKP1NKSEHLSCSFTFFLHGDSNVCTSVE1N0H0PVY 
LLSEEHITLAQOSNSPFQV1LCPFGLNGTLTGQAFKMSDSATKK 
LIGEWXOFYPlSCCLKEMSEEKQEDMDWEDDSLAAVE\ r L.VAGVR 
MI Y PACFVLVPQSDI PTPS P VGSTHCS SSCLGVHQVPASTRDPA 
MSSVTLTPPTSPESVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGK1 PRKliATmVVDRVWQECNMNRAONKRKYSASSGGLCEEATA 
AKVASWDFVEATORTNCSCLRHKNLKSRNAGQQGQAPSLGOOQQ 
ILPKHKTNEKQEKSEXPQKRPLTPFHKRVSVSDDVGMD\ADS\A 
SQRLV\ I SAP\DSQ\ VRFSNIR\TKDVAK\TPOMHGTEfviANS PQ 
PPPLS P \HPCDWDEGVTKTPSTPQ SQKFYQMPTPDPLV PS KPM 
EDR 1 DS LSOSFPPQ YQEAVEPT VYVGTAVNLEEDEAN I AW KYYK 
FPKKKDVEFLPPOLPSDXFKDDPVGPFGOESVTSVTELMVQCKK 
PLKVSDELVQQYQI KNQCLSAIASDAEQEPKI DP YAFVEGDEEF 
LFFDKKDRONSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 
I KODAPR FTSKARP PSTS LI Y DSDLAVS YTDLDNL FNSDEDELT 
PGS KR S ANGSDDKAS CKE S KTGNLD PLS C I STADLH KM Y P TP P S 
LEQK I MG FS PMNMNN KEYGS MDTTPGGT VLEGNSS S 1 G AQ FK I E 
VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 
LPL3 KLPEECI YRQSWTVGKLELIjSSGPSMPPI KEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTFRGAGC-PASAOGSVKYENSDLYSPASTPSTCRPLKSVEP 
ATVFS1PEAHSLYVNLILSESVMNLFKDCNSDSCCICVCKMNIK 
GADVG V Y I PDPTQEAQYT?CTCG FSAVMNRKFGNNSGLFFEDELD 
IIGRNTDCGKEAEKRFEALRATSAEHVNGGLKESEKLSDDLILL 
LQDQCTNLFSPFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 
LEHGRQ FMDNMS GG KVDEAL VKS S CLH P WS KRNDVSMOCSQD 1 1/ 
R MLLS LQ PV LQDAI Q KKRT VR P WGVQG PLTWQQFH KMAGRGS YG 
TDESPEPLPI PTFLLGYDYDYLVIjSPFALPYWERLMLEPYGS QR 

diaywlcpeneallngakskfrdltaiyescrlgqhrpvsrll 
tdgimrvgstaskklseklvaewfsoaadsnneafsklklyaqv 
crydlgpylaslpldssllsqpnlvaptsqslitpp0m7ntgna 
NTPSATLASAASSTMTVTSGVAI stsvatanstlttastsss ss 

SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQAKTVQSGQLGGQ 
C*TS ALQTAG I SG ESS S LPTQPH PD V£ ESTMDRDKVG I PTDGDSH 
AVTYPPAI WYI 3DPFTYENTDESTNSSSVWTLGLLR CFLEMVQ 
TLPPHI KSTVS VQI I PCQY bLQ PV KHEDREI YPQHLKS LAFSAF 
TO CR R PL PTSTNVKTLTG FGPG LAM ETALR S PDRPECI RL Y AP P 
FILAPVKDKQTELGETFGEAGQKYNVLFVGYCLSHDQRWILASC 
TDLYGEbLETCIINIDVPNRARRKKSSARKFGLQKLWEWCLGLV 
OMSSLPWRWIGRLGRIGHGELKDWSCLLSRRNLOSLSKRLKDM 
CRMCGISAADSPSILSACLVAMEP0GSFV1MPDSVSTGSVFGRS 
TT LNMQTS QLNT PQDT S CTH I LV F PTS AS VQ V AS AT YT TE N LDL 
AFNPNNDGADGMGIFDLLDTGDDLDFDI INI LPASPTGSPVHSP 
G S HY PHGGDAG KGQS TDRLLS TE P H EE V PN I LQQPLALG Y F VST 
AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVLEQYNAbSWLTCDPATQDRRSCLPIH 
FW LNQT.1 Y N F I MNML 


5370 


1226 


71t 


RKSRKLELRRAAQATESRPPQSQEMHPFTGKEVHALKRLRDSAN 
AKDVETVQ0LLEDGADPCAADDKGRTALHFASCMGND0IVOLLL 
DHGADPNQRDGliGNTPLHliAACTNHVPV 1 TTLLRGGAR VDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSliALAESLSbFRACTSLPVG 
GCISWL 


5371 


1331 


16' 


JA/^MLWKLLLRSOSCRLCSFRKWRSPPKYRPFLACFTYTTDkCS 
SKENTRTVEKLYKCSVDIRKIRR\*KDGYF*RMKPMLKKLRI/F 
LQELGADETAVASILERCPEAIVCSPTAVKTORKLWQLVCKNEE 
ELIKLIEQFPES FFTI KDQENQ KLN VQF FQELGL KNW 1 S RLLT 
AAPNVFHNPVEKNKQMVR I LQES Y LD VGGSEANMKWLLKLLSQ 
KPFI LLNSPTAI KETLEFLQEOGFTSFE1 LOLLSKLKGFLFQLC 
PRSIQNS1SFSKNAFKC7DHDLKQLVLKCPALLYYSVPVLEERM 
OGLLREGISIAQIRETPM^/LELTPQIVOYRIRKLNSSGYRIKDG 
KLANLNGSKKEFEANFGKIOAKKVRPLFNPVAPLNVEE 


5372 


53 


85'. 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLI bLFVTELSGAHNTTVFOGVAGOS LQVSC PYDSMKHWGR 
R KAW CRQLG EKG PCQR WS THN LV7LL S FLRRWNG STA I TDDTLG 
GTLTITbRNLOPHDAGbYQCOSLHGSEADTLRKVLVEVlADPLD 
HRDAGDbWFPG\DLRASRM?MWSTAS?SASWKEK3PSHPL?SFS 
SWPASFSSRF*QPAPSGLQPGMDRSOGHIHPVNWTVAMT0GISS 
KLCOG 


5373 


2814 


34t 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
OMLLD PTN P S AGTAKI DKQEKV KLN FD MTAS P K I LMSKP VLSGG 
TGRR I SLSDMPRSPMSTNS SVHTGSDVEODAE KKATSSHFS ASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD 
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLJSPKRQ 
I R S RFQLNLDKT I ESCKAQLG I NE I S E D VY TAVEHSDSEDS E KS 
DSSDSEY1SDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVE I KEELKSTSPASEKADPGAV KDKAS PEPEKDFSGKAKPS 
FHPIKDKLKGKDETDSPTVHLGLDSDSE\KELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSKSPPETPVIjTRSSAO 
TS AAGATATTS TSST VTVTAPAPAATG S P VKKQR PLLPKE \ TAP 
AVQR S CGTSSTVQQKE I TQS PSTST I TLVTSTQSS PLVTS SGSM 
STLVS SVNGDLP IGTASADVAADI AKYTSKLN MDAIKGTM\TEI 
YNDLSKN\TTWKAQIiAEDSQGLRI E I EKLQWLHQQEL\SEMKHN 

leltmaemrosweoerdrliaevkkqlelekqqavdetkkkowc 

AKFKKEAIFYCCWNTSYCDYPCQ\C)AHWPEK\MKSCTOSATAPQ 

\oeadae\vntetlnkssogsssstosapsetasa\skeketsa 

EKSKESGSTLDLSGSRETPSSILLGSNOGSDHSR\SKKSSWSSS 

dekrgsXtrsdhn/tpstqhgrsllpgkesragtpflgtsk 


5374 


2814 


34* 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
OMLLDPTNFSAGTAKIDKQEKVKLNFDMTASPKIL.MSKPVLSGG 
TGRRI SLSDMPRSPMSTNSSVHTGS D VEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD 
KTS TTGS I LN LNLDRSKAE MDLKE L S E 5 VQQQSTP VPL3 S PKRQ 
IRSRFOLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEXS 
DSSDS EYI SDVEQKS *GTSOEDTEDKEGCQMDKEPS AVKKKPKP 
TNPVEI KEELKSTSPAS EXADPGAVi&XASPEPBKDESGKAKPS 
PHP I KDKLKG KDETDS PTVHLGLDSDSE\NELV I DLGEDHSGRE 
GRKNKKEFKEPSPKODWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TS AAGATATTSTSSTVTVTAPAPAATGS P VKKQRPLLP KE \TAP 
AV0RSCGTSSTV00KEIT0SPSTSTITLVTST0SSPLVT5SGSM 

YNDLSKN\TTWKAQLAEDS0GLRIEIEKLQWLHQQE1j\SEMKHN 
LELTf-tAEMRQSWEQERDRLIAEVKKOLELEKQQAVDETKKKOWC 
ANFKKEAI FY CC WNTS Y CDYPC0\0AH WPEH\ MKSCTQSATAP0 
\0EADAE\VNTETLNKSSOGSSSSTOSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSI1.IjGSNQGSDHSR\SNKSSWSSS 

dekrgsYtrsdkn/tpstohgrsllpg kesragtpflgts k 


5375 


2907 


1136 


hiflaeeepmlerrcrgplamgpaqprllsgpsoespqtlgkes 
rglrqqgtsva\osgaoapgrahrcahcrrhfpgwva\lwlhtr 
rcqa/rglplpcpecgrrfrhapflalhrqvkaaatpdwgfach 
lcgqsfrgwvalvlhlrahsaakagpfacpkmardafvfrrkaas 
ssi lrrchpsrprgprpfi cgncgrsilptwdq/lkvahkrvhv 
srrp*ergppakvfwgprprgpptgdtppgpggdavdrpf\qca 
ccgkrfrhk\pnuirshaacrsgerphq/csrecg\krftnkpy 

LTS\HRRITHTAROPYPCKECGRRFRHKPNLIjSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPOESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDPIEAPPSLYSCDDCGRS7RLERFLRAHQRQHTGERPFTCAEC 
GKNFGKKTH1.VAHSRVHSGERPFRLARKCGRRFLPRAS0SGGRK 
SAE PNAPRFGPFVCPDCGKAFRHKPYLAAHRPI ATPAEK PYVCP 
DCRKAFSQKSNLWSKRRIHTGERPYACPDCDRSFSQKSNLITH 
RKSHI RDGAFCCAICGQTFDDEERLliAHQKKHDV 


5376 


4504 


593 


vst fs lc lw paggggrgr vs nmaqs krhvy s rtp sgsrms ae as 
arpfcrvgsr vev2 gkghrgtvay vgatlfatgkwvgvi ldeakg 
kndgtvg<3rkyftcde ghg i fvrqs q i qvfedg adttsp et pds 
sas kvlkregtdttakts klrglkpkkaptar ktttrrp kptrp 
astgvagassslgpsgsasagelsssepstpaqtplaap31ptp 
vlts pgavp plps ps ke eeglraq vr dlee kletlrlkrae dka 
klk el ekhki qleqvqewks kmoeqqadlqrrlkearkeake al 
eakerymeemaptadai ematldkemaeeraesiioqeveabxer 
vdelttdleilkaeieekgsdgaassyqlkqleeqnarlkdalv 
rmrdlsssekqehvkXloiclmekknoelevvroorerloeelso 
aes t3 delkeqvdaalgaeemvemltdrnlnleekvrelretvg 
dleamnemndblqenare'telelreqldmagarvreaqkrveaa 
qm'adyqqtikkyroltahlodvnreltnqqeasverqqqppp 

ETFDFKI KFAETKAKAKAI EMELRQME VAQANRHMSLLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLI CKAELI RKQAQEKFELSENCSE 
RPG LRGAAGEQLS FAA1 GLVY\ SLMPAAGHRYHRY * CHALSQCR 
LD\ VY K KVGSLY P EMS AKERSLDFLI ELLHKDQLDETVNVE PLT 
KAI KYYOHLYS IHIAE0PEDCTM0LADHIKFT0SALDCMS VEVG 
RLRAFLQGGQE ATDIALLLRDLETSCS \ DIRQFCKKI RRRMPGT 
DAPGIPAALAFGPOVSDTLLDCRKHLTWWAVLQEVAAAAAQLI 
APLAENEGL L VAALEELAFKAS EQI YGTPSSSP YECLR0S CNI Ij 
I S TMN K\ XjVTAMQEGE YDAERP PSKP PP \VELRAAALRAE I TDA 
EGU?LIOjEDRETVIKEbKKSliKIKGEELSEANVRl>TliI.EKKLDS 

AMK.lJ/iiJr»K XC,j\v\J 1 ti.hH.ijl y>UjLiKAJVDA.n»rriJ-»l I llJJ\±J\jr\L> ± L/\JLi 

EAEKAEL KQRLNSQS KRT1 EGLRGPPPSGIATLVSGIAGEEQQH 
GAIPGOAPGSVPGPGLVKPSPLLLQQISAMRLHISQLOHENSIL 

LNQLSTHTHWDITRTS PAAKS PSAQLMEQVAQLKSLSDTVEKL 
KDEVLKETVSQR PGATVPTDFATFPSS& FhEAKEEQQDDTVYMG 
KVTFSCAAGFG0RHRLVLTQEOLH0LHSRLIS 


5377 


762 


1106 


DVPCKRVLPAEAQEKGOLTLSCGESGEEG\F*YHEVRQAEGES* 
/WFGPNVRLVHTQLKTKKPSGTLKAKFYLHTGSTKFAAR1SCTX 
SS * WPG YDGWKGGQ YI FI FRGMRWEEQP 


5378


2009 


664 


QASGTTLRPLPDLPOLKRREATSRNRALKPRGRLVLMTSCLPAL 
RF I ATPRLS AMPH I DNDVKLD F KDVLLR PKRSTLKSRSE VDLTR 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

SFSFRNSK0TYSGVPIIAANMDTV6TFEMAKVLCKS*VPGSFWD
VPQMGCVFXIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVK
KH Y S LVQWQE F AGON P DCLE HLAAS SGTGS SDFEQLEQ I LEA1 F
QVK YI CLDVANGYS EH FVEF VKD VRKE FPQHTIMAGNWTGEM V
EELILSGADI I KVG 1 G PGSVCTTRXKTG VGYPQLSAVME CADAA 
HG L KGH1 ISDGGCSC PGD VAKA FG AGAD F VMLGGKLAGH SE £G G I 
ELlERDGKKYKLFYGMSS*I\AM\XKYAGGVAEYRASEGKrVEV
PFKGDVEHTIHD3 LGGIRSTCTY VGAAXLKELSRRTTFI RVTOQ
VNPIFSEAC 


5379 


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTS CLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTE 
S FS FRNS KQTY S GV P 1 1 AANMDTVGTFEMAKVLCKS * V PG S FWD 
VPOMGCVFL I Y KLFTLKWKKLLLS VLLPAS I LVAEKFSL FTAVH 
KH YS b VO WQ E F AGON P DC LEHLAAS SGTGS S D? EQLEQ I LEA I P 
0 VK Y I CLD VANG YS EH FV EFVK DVR KR FPOHTI MAGNW TG EMV 
EELILSGADI I KVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGH1 1 SDGGCS C PGDVAKAFGAGADFVMLGGMLAGK S ESGG
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFI RVTQO 
VNPIFSEAC


5380 


2 


2050 


PS RAGG AERGRAAAARS PGGSAAGWECPSVLDEAGACTMSSCVS 
SQPSSNRAAPODELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
S Fl WTECE PGCAVDLGIARDRP LEADGQE VPLDTSGSQARPHL 
SGRKLSL0ERS0GGLAAGGSLDMNGRC1CPSLPYSPVSSPQSSP 
RLPRRPTVESHHVSITGMQDCVOLNQYTLKDEIGKGSYGWKLA
YNEWDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIOP 

RGPI \eqvyoeia\ilkkldkpnw\klvevl\ddpnedhlymv 

F\ELVNQGPVMEVFTLKPLSED0ARFYF0DLIKGIEYLHY0KI1
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA 
FMAPE S LS E TRKI F SG KALDVW AMG VTLY CF VFG * C P FM DE R I M 
CLHSXI KSO ALE FPDQPD I AEDL KDLI TRNJLDXNPESR 1 W PE 1 
KLH PWVTRHGA2 PL P S EDENCTLVEVTEE EVENS VKK I PSLATV 
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELXT*KISPLPACCXVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*FEPPRTDEALCPYETGRTCWAPLLQVLWWVGT?LPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSKTM 


5381


2 


2050 


psraggae kgraaa ar s pggsaagwecps vldeagactmss cvs 
sops snraapqdelggrg ssssesqkp ce alrglssls 1 hlgme 
s f 3 wtece pg cavdlglardrpleadgoevpldtsgsoarphl 
sgrklsl0ersogglaaggsldmngrcicpslpyspvssp0ssp 
rlprrptveshhvs i tgmqdcv0lnqytlkde1gkgsygwkjla 
ykendntyyamkvlskkklirqaafprrppprgtrpapggciqf 
rgpiVeovyoeiaMlkkldhpnwXklvevlXddpnedhlymv 
f\elvnqgpvmevptlkplsedqarfyfqdlikgieylmyqxi i
h\rdi kpsnllvgedghikiadfgvsnefkgsdallsntvgtpa
fkapeslsetrki fsgkaldvwamgvtlycfvfg*cpfmderik

CLKSKIKSQALEFPDOPDIAEDLKDLITRMIiDKNPESRIWPEI
KLHPWVTRHGAE PLPSEDENCTLVEVTEEEVENS VKH1 PSLATV
ILVKTMIRKRSFGNFFEGSRREERSLSAPC-NLLTKKPTRECESL
SELKT * KI SPLPACCKVT* EFPH P SGCRPS CWQP P FLHTHS 0PR
* PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNI ALLRYNSHTM


5382 


1536 


203 


GARGSQODAPALQEAEVRGPERAQPARGRMTKARLFRLWLVx.GS
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE
LTADS DVDE FLDXFLS AGVKQSDLPRKETEQP PAPGSMEE SVRG 
YDWSPRDARRSPDOGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DI PNS ELSHL I VDDRHGAI YCYVPXVACTNWXRVM I VLSGSLLH 
RG A PY RDPLR I P R EHVHN AS AH LTFNK FWRR Y GKLS RJJ LM KVKL 
KKYTXFLFVRDPFVRLISAFRSKFELENEEF/*PQVRRAHAAAV 
RCPHQPARLGARGLPRWPQ\VSFANFIQYLIX>PHT3KLAPFNEK 
WRQVYRLCHPCQ I D YDFVGKLETLDEDAAQLLOLLQVDLAAPLP 
PELPGTGPPSSWEEDWFAK1PIJVWRQ0LYKLYEADFVLFGYPKP 
ENLLRD 


5383 


45 


5250 


VERLLGCRNSKRTKRMLISKNMPWRR1QGISFGMYSAEELKKLS 
VKSITNPRYLDSLGNPSANGLYDLALGPADSKEVCSTCVODFSN 
CSGHLGHIELPLTVYKPLLFDKLYLLLRGSCIiNCHKLTCPRAVI 
HLLL CQLR V LEV G ALQA V YE LER I LS R FLE EN ADPS AS E I R EE L 
EQYTTEIVONNLLGSOGAHVKNVCESKSKLIALFWKAHMNAXRC 
PHCKTGRSWRKEHNSKLTITFPAIWHRTAGQKDSEPLGIEEAQ 
IGKRGYLTPTSAREHLSALWKNEGFFLNYLFSGMDEDGMESRFN 
PSVFFLDFLWPPSRSRPVSRLGDQMFTNGQTVNLOAVMKDWL 
3RKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSPLSTLPGQ 
SLIDKLYNIWIRL0SHVNIVFDSEMDKLMMDKYPG1ROILEKKE 
GL FR KH MMG KR VDYAAR S VI C PDMY I NTN E 3 G I PMVF ATKLT Y P 
OPVTPI'nWOELROAVINGPNVHPGASMVINEDGSRTALSAVDMT 
CREAVAKQbLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPTLH 1 
RPS I OAH RAR r LPEEKVLR LHYANCKAYNADFDGDEKNAHFPQS 
ELGRAE AYVLACTDCjQY LV PKDGQPLAGI/1QDHMV SG ASMTTRG 
CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 
TLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSVPGFNPDSMC 
ESQVI I R EG E LLCG V LD KAHYGS S AYG LVKCCYE I YGGETS G KV 
LTCLARLFTAYLQLYRG FTL3VED I LVKP KADVKRQR 1 3 EESTH 
CGPQAVRAALNLPEAASYDSVRGKWQDAHLGKDORDFNM1DLKF 
KEEVNHYSNEINKACMPFGI.HRQFPENTLQLMVQSGAKGSTVNT 
MQ 2 S CI il. iGQ 3 ELE GR S T PLMASGKSLP CFE P YEFT PRAGG F VTG 
RFLTGI KPPEFFFHCKAGREGLVDTAVKTSRSGYL0RC1 1 KHLE 
GLWOYDLTVRDSDGSWQFLYGEDGLDIPKTQFLOPKOFPFLA 
SNYEVl MKSOHLHEVLSRADPKKALHHFRAI KKWQSKJ IFNTLLR 
RGAFLSYSQKI QEAVKALKLESENRJNGR/RPWDS/G/RMLRMWY 
ELDEESRRKYOKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 
YSQEWAAQTE KS Y E KS ELSLDRLRTLLQL\ KWQRS LCE PGEAVG 
LLAAOS I GEPSTOMTLNTFHFAGRGEMNVTLGI PRLRE3LMVAS 
ANI KTPMMS V P VLNTKKALKRVKSLKKQLTRVCI^EVLOKl DVQ 
ESFCMEEKQNKFQVYQLRFQFLPKAYYQQEKCLRPEDILRFMET 
RFFKLLMSS 1 KKKNNKASAFRNVNTRRATQRDLDNAGELGRSRG 
EOEGDEE EEGH I VDAEAEEGDADASDAKRKEKQEEE VDYES EEE 
EEREGEENDDEDMOEERNPHREGARKTOEQDEEVGL/GK*GGPV 

TEESLVJCQVTVKLPLMKINFDMSSLVVSLAHGAVIYATKGITRC 
LLNETTNN KNE KE LVLNTEGI NLPEL FKY AE VLDLRR L YSND I H 
A'i ANTYG I E AALRV I EKE I KDVFAVYG IAVDPRHLSLVAJDYMCF 
EGVYXPLNRFG I RSNSSPLQQMTFETS FQ FLKQATMLG SHDELR 
SPSACLWGKWRGGTGLFELKQPLR 


5384


196 


886 


QSCX50RLPTVL*L*GPPGSCPCIl>SLF\PGRPHAJjPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALESGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGIYFFSLNVHSWNYKETYVHIMHNQKEAVILYAOPS 
ERSIMOSOSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 
SGHLI XAEDD 


5385 


326 


799 


LIWPRTKKEAJ>APPKAEAKAFAL\KAKKAVLKDVESHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRKKLDHHVIIKFPLTTE*A 
VKXI EMtfS LLVFTVDVKANKHQI KQAVKK / LCDIDVA K VWTL J Q 
SDGERKAY VR LA PDYDALWATK I GI T 


5386 


326


799 


LMVPRTKXEAPAPPKAEAKAKAL\KAXKAVLICDVHSHKKKKIHNj 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKXIENNSLLVVTVDVKANKHQIKQAVKK/LCDIDVAKVm'LIQ 
SDG2RXAYVRLAPDYDALWATKIGIT 


5387 


2 


2117 


FWAASGGCW F VLGERRAGS LLSAS YGT FAMPGMVLFGRRWAXA 
SDDLVFPG FFELWRVLWWI G I LTLYLMHRG KLDCAGGALLSS Y 
LI VLMI LLAWI CTVSAI MCVSMRGTICNPGPRKSMSKLLYIRL 
ALFFPEMVWAS LGAA WVADGVQCDRTVVNG I IATVWSWI 1 1 AA 
TWS III VFDPLGGKMAP YSSAGPSHLDSHDSSQLLNGLKrAAT 
SVWETR I KLLCCCI GKDDHTRVA FS S TAE LFSTYFSDTDLVPSD 
IAAGLALLHQQ0DNIRKNQZ3PAOWCHAPGSS0EADLDAELKNC 
HH YMQFAAAAYG W PLY1 YRNPLTGLCRIGGDCCR S KNPQTMT/M 
VGGDOL0L/CTSAFILHTHRAAV0GLKPROLPWTRFTELPFLVA 
LDHRKESVWAVRGTMSLQDVLTDLSAESEVLDVECEVODRIAH 
KGISQAARYVYORLINDGILSCAFSIAPEYRLVIVGHSLGGGAA 
ALLATM VRAA Y PQVR CYAFS P PRGLWS KALQE YSQS F3 VS LVLG 
KDVIPRLSVTNLEDLKRRILRWAHCKKPKYKILLHGLWYELFG 
GN P NNLPTELIX3GDQEVLTQ PLLGEQS LLTRWSPAYS FS S DS PL 
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 
FSK 1 L I GPKMLTDHMPDI LMRALDS WSDRAACVSCPAOGVS SV 
DVA 


5388 




753 


TADGGAGGGGRROAGVRRHYLYPFTGGYRRRRAACOAERPAARS 
KDTDUVAYQKGKLGVQLRNI4AQETNHSQVPMLCSTGCGFYGKPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEACSALDST 
SSSMOPSPVSNQSLLSESVASSQLDSTSVCKAVPETEDVOASVS 
DTAQQPSEEOSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTOMYTIALTXTKQMLKNFVFQQEFKSFGSFHCQLLEYK 
ILEHLQTKN 


5389 




753 


TADGG AG GGG R RQAG VRRH YLY P FTGG YRRRRAACO AE R P AARS 
KDTDloAAYOKGNLGVQLRNMAOETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAOSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVOASVS 
•>TAQQPSEE0SKSLE\NR2*KKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTOMYTlAIiTITKQKLKNFVFQOEFKSFGSFHOQbLEYK 
ILEHLQTKN 


5390 


21'/ 


1332 


EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREOLSKSTRDFI 
EGG ACDS 1 TRDDN I AAFKRIRLRPRYLRDVSEVDTRTT10GEE I 
SAF1 CIAPTGFHCLVWPDGEMSTARAAQAA\G1 CYITSTFASCS 
LEDIVIAAPEGLRWFQLYVHPDLQLNKQLICRVESLGFKALVIT 
LDTP VCGNRRHD I RNQLRRNLTbTDLQS PKKGNAI P YFQMTP I S 
TSLCVWDLSWFQSITRIjPIILKGILTKEDAELAVKHNVOGIIVS 

nhggrobdevlas i daltewaavkgki evyldggvrtgkdvlk 
alalgakciflgdailwalaskgehgvkevlwiltnefhtsmaX 

LTGCRSVAElNRNbVQFSRL 


5391 


- 


1292 


vkkaagrsrgpptaggqrceeapgtvmerrlgvrawvkenrgsf 
oppvcnkl.mliqeqbkvmfvggpntrkdyhieegeevfyolegdm 

VLRVLEQG KHRDWI RQGEI FLLPARVPHSPQRFAKTVGbWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTOLAPIIOEFFS 
SE0YRTGKPIPD0LLKEPPFPLSTRS1MEPMSLDAWLDSHHREI> 
QAGT P LS LFGDTY ETQV IAYG QG S SEGLRQNVD VW LWQLEG S S V 
VTMGGRRliSl»GPVJMDSL»LVLSVJGPSY\AW\ERTOGSVALSVT\Q 
DPACKKSPWGEPSCHGLKAATGVPSTLEVPSLPNNSPSPHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLKQrQPTAli 
PVLPGGLPPAPLLPIPLSLQTQCSTSTPRRPSIKAS 


5392 


3 


1623 


irgskaokwgasgsggagpopdpagpggvpaliaaavlgacepr 
caapcplpalsrcrgagsrgsrggrgaagsgdaaaaaew jrkgs 
f i hk pahg wlh p dar vlgpg vs y wr ymgci £ vlr smr s ldfnt 
rtqvtrea inrlheavpgvrgs wkkkapnkalas vlgksnlrfa 
gmsisihistdglslsvpatrqvianhhmpsisfasggdtdmtd 
yvayvaxdpinqrachii£ccegl\aqsiistvgoafelrfkqy 
lhsppkvalpperlagpeesawgdeedslehryynsipgkeppl 
ggbvds rlaltq pcaltaldqg ps pslrdacs lp wd vgstgtap 
pgdgyvoadargppdheehlyvntqgldapepedspkkdlfdmr 

PFEDALKLHELSVAAGVTAAPLPLEDQVJP5>t'FlKK/ii J Witty 

lroepwyhgrmsrraasrmlradgdflvrdsvtnpgqyvltgmh 

AGQPKK LL L VD PEG WRTKD VL FE S I S HLI DHHLQNGQ P I VAAE 
SELHLRGWSREP 


5393




982 


GGDSAGMTMETOMSONVCPRNLWLLQPLTVLLLLASADSOAAAP 
P KAVLKLE PP WINVLQ\ EDSVTLTCQGAPQP / ERSDS I QW FHNG 
WO 01/53312 PCT/US00/34263



LSEWLVLQTPHLEFQEGETI MLRCHS \ WRDKP\LVKVTFFQNGK 
SQKFSHbDPTFS 1 PQANHSHSGDYHCTGN1 GYTLFSSKFVTI TV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAVVALIYCRKKRISAN 
STDPVKAAOFEPPGRQMIAIRKRQLEETNWDYETADGGYMTLNP 
RAPTDDDKN I YLTLPPNDH VNSNN 


5394


4. 




GGDSAGMTMETQMSQNVCPRNLWLLOPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLO\EDSVTLTCOGAPQP/ERSDSIOWFHNG 
\NLI PTHTQPb \ YRFKAJ>TtW\DSGEYTCOTGOTSL\SDPVHLW 
LSEWLVLCTPHLEFQEGETIMIiRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGN I G YTLFSSKP VTI TV 
QV P s MGSSS PMGT.I V A W I ATA V AA I V AA WAL I YCRKKR I SAN 
STDPVKAAQFEPFGROMIAIRKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5395


3135 


533 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSIjQASDFDGAS 
E SGNPEAVALAPDAYSTGSS S ASSTLKRTKKPRP PS LKKXQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPFASGGGRVONSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGKSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRS PAEPNDI P 1 AKGTYTFD I DKWDDPNFN P FSSTS KMQ ESPKL 
PCOS YNFDPDTCDESVDP FKTSSKTPSS PSKS PAS FE1 PASAME 
ANGVDGDGLNKPAKKKKTPLXTDTFRVKKSFKRSPLSDPPSQDP 
TPAATPETPPVISAWHATDEEKLAVTNQKWTCMTVDLEADKQD 

V PQPSDLSTFVNETKFSS PTEELDY RNS YE 1 EYMEKI GS SLPQD 
DDAPKXQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNGHPVPRGLAPNQESHLQVPSKSSQKELEAMGLGTP 
SEA I EI TAPEGS FASAJDALLSRLAHPVS LCGALDYLEPDLAEKN 
P PL FAQ K LQR EAAH PTD VS I S KTALY SR 1 GTAEVE K PAG LLFQQ 
P DLDSALQI ARAE 1 1 TKEREVSEWKDKY EESRREVMEMRKI VAE 
YEKT I AQM I EDEORSKSVS \HQTVQQLVLE KEQA\LADLNS VEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAOEYLSRVKKEEOR 

Y QALKVHA\ EEKLDRAN AE \ I AQV RGKAQQEO AAHQASLAER S S 
CRV \DALERTLEOKNKEI E ELTK1CDEL I A KKGKS 


5396 


3135


533 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
S SGNPE AVALA PDAYS TGS S S ASSTLKR TKKP RPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVONSPPVG 
RXTLPLTTAPEAGEVTPSDSGGQEDSPAKGKSVRLEFDYSEDKS 
S KDNOOEN P PPTK K I G KKP V AKM P LRR P KM KKT P E K1»DNTP AS ? 
PRSPAEPNDIP I AKGTYTFDI DKWDDPN FN PFS STS KMQESPK1. 
POOS YNFDPDTCDES VDPFKTSSKTPS S PS KSPA*SFE I PASAME 
ANGVDGDGLN KP AKKKKTP LKTDTFR VK XS P KRS PLS DP PS QD? 
T PAATPETPPV I S AVVHATDEEKLAVTNQKWTCMTVDLEADKQD 

Y PQ PSDLSTFVNE TKFS S P TEE LDYRNS YE I E YMEX I G SSLP QD 
DDAPKXQALYLMFDTSQESPVKSSPVKMSESPTPCSGSSFEETE 
ALVNTAAKNOHPVPRGLAFNQBSHLQVPEKSSOKELEAMGLGTP 
SEAIEI TAPEGS FA SAD ALLS RLAHPVS LCGALD YLE PDLAEKN 

PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAE 

Y EKT I AQM I EDEQREKS VS\HQTVQQLVLE KEQA\ LADLNS VEK 
\SIjADLFRRYEKMKEVLEGFRKNSEVLKRCAQEYLSRVKKEEQR 

yoalkvha\eekldranae\iaqvrgkaqqsqaahqaslaerss 
crvndalertleoknkeieeltkicdeliakmgks 


5397 


3135


533 


ra.sdaknqegllntrrkstdsvpiskstlsrslslqasdfdgas 

CCf?MPPAVALAPnAV<5TGC:CCA<;€;TT.XR'^KXPRPPSLKKKOTTK 

kptetppvketqqepdeeslvpsgenlasetktesaktegpspa 
lleetplepaagpkaacpldsesvegwppasgggrvqnsppvg 
r xtlplttapeagevtpsdsggqedspaxghs vrlefpys edks 
swdnqqenppptkkigkkpvakmplrrpkmkxtpeklidntpasp 
prs paepndi piakgtytfdi dkwddpnfk p rsstskmqespkl 
PQCS YNFDPDTCDESVDPFKTSS KTVSS PSKSPAS FEI PA SAME 
ANG^GDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPA^.TPETPPVl S AWHATDEE KLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPOD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVKT AAKNQH P V P RG LAPNQE SHLQVP E KSSO KELEAMG LGTP 
SEA2EI TAPEGS FAS ADALLS RLAHPVS X.CGALDYLEPDLAEXN 
PPL'FAQKXjQREAAHPTDVS IS KTALYSR IGTALVEKPAGIjLFQQ 

pdldsalqiarreiitkerevsewkdkyeesrrewemrkivae 
yektiaomiedeokeksvs\hotvoqlvlekeoa\ladlnsvek 
\sliadlfrryekmkevlegfrkneevlkrcaoeylssvkkeeqr 
yoalkvha\eekldranae\iaovrgkaqqeqaahoaslaerss 

CRVNDALERTLEQKNKEIEELTKI CDELIAKMGKS 


5398


St 




SGEVCRMESNFNOEGVPRPSYVFSADPIARPSEIKFDGIKLDLS 
HEFSLVAPNTEANSFBSKDYL0VCLRIRPFT0SEKELESEGO/H 
I LDSOT WLKEPQCI LGRLS EK SSG \ QM \ AOKFS FFPG FLG PAT 
TOKEFFQGCI MHP\ VKDLLKGQSRLI FTYGLTNSGKTYTFQGTE 
ENIRIbPRTLNVLFDSLQERliYTKMNLKPHRSRBYLRLSSEQEK 
EEIASKSALLROIKEVTVHNDSDDTLYGSLTNSLNISEFEESIK 
DYEOANLNMANSIKFSW^VSFFEIYWEYIYDLFVPVSSKFOKKK 
MbRLSQDVKGYSFl KDLQWI QV SDS K EAY RLLKLG 1 KHQSVAFT 
KLNN^SSRSHSIPTVKIUJIEDSEMSRVIRVSELSLCDLAGSER 
TMKTQKEGERLRETGNINTSIjLTLGKCINVLKNSEKSKFOOHVP 
FR E S K LTH YF/QS F FNGKGK I CM I VN I S QC YLA Y DETLNVLK FS 
Al AQKVCVPDTLKS SQBKLFGPVKSSQDVS LDSNSNSKILNVXR 
ATISWENSLEDLKEDEDLVEELENAEETED/VGETKLLDEDLDK 
TJjEENKAFTSHEEKRKLLDLIEDLKKKLINSKKEKLTLEFKIRE 
EVTOEFTOYWAOREAPFKETLLOEREl LEENAERRbAI FKDLVG 
KCDTREE AAXD1 CATKVETEEATA CbELKFNQI KAEbAKTKGEb 
I KTKEELKKRENESDSLIQELETSNKK J ITONQR I KEL INI IDQ 
KEDTINEFQNLKSHMENTFKCNDKADTSSLI INNKLI CNETVEV 
PKDS KS KI CSSRKRVNENELQODE PPAKKG S I HVSS AI TEDQKK 
S EE VR PN I AE I ED I RVLCjENNEGLRAFLLT I ENELKNEKEEKAE 
LNKOI VH FQQELSLSEKKNLTLS K EVQQ I QSNYD 2 AI AELHVQK 
S KN0E0 EE KIMKLSNEI ETATR S I TNNVSO I KLMKTK I DELRTL 
DSVSOISNIDLLNLRDLSNGSEEDNLPNTQLDbLGNDYLVSKQV 
KEYR10EPNRENSFHSSIEA1WEECKEIVKASSKKSHQIEELEQ 
0 1 E XLQAE VKG YKDENIORLKEKEK KNQDDLLKEKETLIQOLKEE 
LQEiGWTLDVOIQKVVEGKRALSELTQGVTCYKAKIKEJUETILE 
T0KVERSHSAKLEQD1 bEKESl 1 LKLERNLKEFOEKLQDSVKNT 
KDL WKEL KLKEEI TQLTNNLODM KHLLQLKEE EEETNRQ ETEK 
LKEELSASSARTQN\LNADLQRKEEDYADLKEKbTDAKK0IKOV 
QKEVSVMRDEDKLLRIKINELEKK}^OCSQELDMKOR\TIOQLK 
E0L1MQKVEEAI00YERACKDLNVKEKI I EDMRMTLEEQEQTQV 
EQDOVL \EAKLSE VERLATELDR WR VKCNDLETKNNQRSNKEHE 
NKTD VLG KLTNLQDELQES EQK YN ADRK KWLE E KMML I TOAKEA 
EN3RNKEMKKYAEDRERFFKQ0NEMEILTAQLTEKDSDLQKWRE 
ERDQL VAALE I QLKAL J S SNVQKDNE I EOLKRI I SETS KI ETQI 

riJ/j.KPJ^J.ooAJyrUAljv-i-t'* ajO J or l^ioKrt JVJ. tlJvjo V vjjlJoLJj V 

S TEND QS TR FPKPEDE I QFTPLOPN KMAVKH PGCTX P VTVKI P K 
ARKRKSNEMEEDLVKCENKKNATPRTNLKFPISDDRNSSVKKEQ 
KVAIRPSSKKTYSLRSQASIIGVI^LATKKKEGTLQKFGDFLQHS 

ISSPIDISGQVILMDOXMKESDKOHKRPLRTKTAK 


5399 


705


230 


GPRMAXFLSQDQINEYKECFSIjYDKQQRGKIKATDLMVAMRCLG 
7\cpTC GPVfYRin .OTHC;i DGNGF.bDFSTFLTl MHKOI KOEDPXKE 
I LLAMLMVDKEK KG YVMASDLR S K LTS LGE KLTH KEV \DDLFRE 
\ADIEPNGKVKYDSFIHKITSYLDGTY 


5400 


931 


248 


SKCSSG^IPPTNYPASRAAIjVAONYINYQOGTPHRVFEVQXVK 
QASMEDIPGRGHKYRI'KFAVEEIIOKQVKVKCTAEVLYPSTGQE 
TAP EVN FTFEGETG KN PDEEDNT FYQ R LKSMKE PLEAQN I \ PDN 
FGNVSPEMTLVLHLAWVACGY I IWONSTEDTWKMVKIQTVKQV 
ORNDDFIELDVTILLHNlASOEIIPWQMOVLWHPQyGTKVKHNS 
RLPKEVQLE 


5401 




1360 


tgwsygpttslaflaprdfpfppklllhpoawrlscgagsmgs 
oaaaewrnwaswegssslsgcsmgcfkddrivfwtwmfstyfme 
kwaprqddmlfyvrrklaysgsesgadgrkaaefevevevyrrd 
skklpglgdpdidweesvclnlilqkldymvtcavctradggdi 
hihkkksqqvfaspskhpmdskgeeskisypsiffm:dsf\ee\ 
vfsdmtvgkgemvcvelvasdktwtfc-gvi fqgs i ryealkkvy* 
dnrvsvaarmaqk\msfgf£kysnmef\vr\mkgpogkghaema 
vsrvstgdtspcgteedssfaspmhervtsfstpptpernnrpa 
ffspslkrkvprnriaemkkshsandseeffreddggadlhkat 

NLRSRSLSGTGRSLVGSWLKLNRADGNFLLVAHbTYVTLPLHRI 
LTDILEVROKPILMT 


5402 


3445 


1563 


GECF1MAAWQ0NDLVFEFASNVMHDERQLGDPAI FPAVI VEH V 
PGAD1 LNS YAGLACVEEPNDM I TESSLDVAEEE I IDDDDDDI TL 
TVEASCHDGDETIETlEAAEALLNHDSPGPMLDEKRINTmiFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVOEKYAESPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNlSVKKKNKDGKGNTIYLWEFLb
ALLQ D KATCP KYI KWTQR E KG I FKLVDS KP V S RLWR KH KN KP \ D 
MN YE PMGRALRY YYQRG I IAKVEGQR LV YQFK EMPKDL 1 Y I NDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNS KAAKPKDPVEVAQPSEVLRTVQPTOS PYPTQLFRTVHWQ 
P VQA VPEGEAARTSTMQDETLNSS VQS IR\ TI QAPTQVPVWS P
RNQO\LHTVTLQrVPLTTVIASTDPSAGTGSOKFILOAIPSSQP 
MTVLKENVMLQSQKAGS PPSI VLG PARV\QOVLTSNVQTI CNGT 
VSV\ASSPSFS\ATAPWTIjFLU5SSQLVAHPPGTVITSVIKTQ 

etktltoevekkesedhlkentekteoqpqpyvmwsssngfts 
ovamkonellepnsf 


5403 


3445 



1563 


GECF1 maawqqndlvfefasnvmederqlgdpai fpavi vehv 

PGADI LNSYAGLACVE E PNDM I TES SLDVAEEE I I DDDDDD I Th 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNN1FSS 
PEDDNWAPVTHVSVTLDGIPEVMETQOVCEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLODKATCPKY I KWTQREKGI FKLVDS KPVSRLWRKHKNKP \D 
MWEPMGRALRYYYORGII1AKVEGQRL.VYQFKEMPKDLIYINDE 
DPSSS I ESSDPSLSSSATSNRNOTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSFYPTQLFRTVHVVQ 
PVQAVPEGEAARTSTMQDETLNSSV0SIR\T10APTQVPVWSP 
RNCQ\LHTVTLQTVPL,TTVIASTDPSAGTGSQKFILQA1PSSQP 
MTVLKENVMLQSQKAGSP PS I VLGPARV\QOVLTSNVQTI CNGT 
VSV\ ASSPSFS \ATAPWTLFIjLGSSQLVAHPPGTVITSV1 KTQ 
E TKTLTQEVEKKE S EDH LKEN TEKTE QQPQP YVMV VS S SNG FTS 
OVAM KQNELLE PN S F 


5404 


187 



1111 


LP VTL I FAKMKTLQ ST LLLLLL VPL I KPAP PTQQDS R 1 1 YDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI I PNEKS LQLQKDE AI 
TPLFPKKENDEMPTCLLCVCLSGSVYCEEVD1DAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSLAENQLLXLPVLPPKLTLFNAXYNKIKSRGIKANAFK 
KLNNLTFLYLDHNALES VPLNbPE SLRVI HL0FNN IAS I TDDTF 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
I LS LDQI KAI RGSNE YTEGPS WKRPAPRTAPRQE KHERTHE 1 1 
PINVNNNYEHRHTSHbGHAVLPSNARGPILSRSTSTGSAASSGS 
KSSASSEOGLLGRSPPTRPVPGHRSERAIRT0PKOLIVDDLKGS 
LKEDLTOHKFICEOCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
Sr4VEYGTCMCL\VKGI?YHCSNDDEGDSYSDNPCSCS0SHCCSR 
Y LCMG AMS L FLPCLLC Y P P AKG CLXLCRRC Y DW I HR PG CRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


i 275 


2732 


R WRT YNVEGPLTFMDVAI E FCLEE WOCLDTAOONLYRNVMLENY 
KNLVFLG/IIAVSKFDLITCLEOEKEPWEPMRRHEMVAKPPVKC
SHFTOOFMPEQHIKDFFOKATLRRYXNCEHKNVHUCKDHKSVDE
CXVHRGGYNGFNCCIjFATQSXI FLFDXCVKAFH KFSNSNRHKI £ 
HTEKKLFKCKECGKS FCKLSHLAQHKI IHTR VNFCKCEKCGKAF 
NCPSIITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKraYTRY 
KLYKCEECGKAFNKSS I LTTHKI 1 RTGEKFYKCXECAKAFNQSS 
NJ/TEHKKIHPGEKPYKCEECGKAFNWPSTLTKHX3IHTGEXPYT
CEECG KAFNQFSNLTTKKR I HTA\ EKF YKCTECGEAFS R.S \S NL 1 
TKHKEIHTEKKPYKCEECGKAFKVJSSKLTEHKLTHTGEKPYKCE 
KCGKAFNCPSIITKHNRIKTGEKPYTCEECGKVFNWSSRLTTHK
KNYTRYKLYKCEBCGXAFNKSS1LTTHKKIH3EKXFYKCEECGK 5 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT
GEKPYKCEECGKAfTQSSNLTTKKKIHTGEKFYKCEECGKAFrC 
SSNLTTHKKIHTGGKPYKCEECGKAFWOFSTLTKHKIIHTEEKF
Y KCEE CGKAFKW S STLTKH XI I HTGEXP YKCEE CG \ KAF KLS S7 
LSTHKI IHTGEKPYKCEKCGKAFNRPSNLIEHKK1HTGEQPYKC
EECGKAFNYS SHLNTHKR IHTKEQ P YKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTV1LTTPOTFSWIK


5407


3 


655: 


RFRRROSSCCTGWbAGWLbRAAPRFCRRTETDMEQGKGLAVLIL 
All LLQGTLAQS I KGNHLVKVYDYQEDGSVL1/TCDAEAKNITWF 
KDGKM I GFLTEDKKKWNLGSNAKDPRGM YQCKGSQNKSKPLQVY 
YRMCQNCIELNAATI SGFLFAEI VS I FDLAVGV YFI AGTGMEFR 
OSNRASDKQTLLPXNDPAPTOPLKDPRKMTQYSHLQGNNQIjRRN


5408 


2745


612e 


OGSKGTCHPt3AO^PWDEGVWOEAPSQSEPWGQSOEPPTKPQRIiP
HAROHTPJbPLGSADYRjRWSVRPQGPHRDPKDSRDAAKREQGSL 
APR P V P AS RGG KTLCKGY RQAP PG P PAQ FOR P 1 CSAS PPWASR F 
STPCPGGAVREDTYPVGTOGVPSLALAQGGPGGSWRFLEWKSMP 
RLPTDLD I GG P WFKH YE FE R SCWVRA3 SQEDQLATCWQAEHCG g 
VRNKDMS WPEEMS F I AN£ SKI DRHK VPTEKGATGLS NLGNTCFM
NSS1 QCVSNTOPLT0YF3 SGRHLYELNRTNPIGMKGHMAKCYGD
LVQELKSGTQKNVAPLKLRWT I AKYAPRFNGFQQQDSQELLAF L
LDGLHEDLNRVHEXPYVELKDSDGRPDWEVAAEAWDNHLRRNRS
IWDLFHGQLRSOVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL
E I TV I KbDGTTPVR Y GLRLMMDEKYTGLKKQL/SDLCGLNSE QI L I 
LAEVHGSNIKNFPODKQKVRLSVSGFLCAFEIPVPVSPISASSP
TQTDFSSSPSTNEMFTLTTNGDLPRPI FI PNGMPNCWPCGTEX
NFTWGMVNGHMPSLPDSPFTGY I IAVKRKI4MRTELYFLSSQKNR 
PSLFGMPblVPCTVHTRKKDLYBAVWIQVSRLASPLPPOEASNH 
AQIXrDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFI GNA YI AVDWKPTALHLR YCTSQER WDEHES VEQSRR&Q
VSPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLVrR
LPPILIIHLKRPOFVNGRVJIKSQKIVKFPRESFDPSAFLVPRDP 
ALCQHKPliTPCJGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSFKGSPSSSRKSGTSCPSSKNSSPKSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH 
VLGGSQPELVTPODKEVALANGFLYEHEACGNGCGNGYSNGQIiG 
NHSEEDSTDDQREDTR I KPI YNLYAISCHSGIU3GGHYVTYAKN 
PW CKW Y C YNDS SCKE LH PDE I DTDS AY I LFYEQOG I D Y AQ FLP K 
TDGiCKMADTSSMDEDFE5DY\EKYCVLO 


5409 


2745 


612fc 


OGSKGTCHPQAQOPV3DEGVWQEAPSOSBPWGQSQEPPTMPQRLP 
HARQHTPLPLGSADYRRWSVRPOGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF 
STPCPGGAVREDTYPVGTOGVPSLALAQGGPQGSWRFLEWKSMP \ 
RLPTDLDI GGPWFPHY DFERSCVIVRAI SQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANS SKI DRHKVPTEKGATGLSNLGNTCFM 
NSSIQCVSNTQPLTQY FI SGRHLYELNRTNPIGMKGHMAKCYGD 
LVOELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS 
IWDLFKGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
EITVIKLDGTrPVRYGLRLNMDEKYTGLKKOLSDLCGLNSEOIL 
LAEVHGSNIKNFPODNOKVRLSVSGFLCAFEIPVPVSPISASSP 
TOTDFfSSPSTKEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK 
NFTNGMVNG] !M PS LPDS PFTGY 1 1 A VHRKMMRTELY FLS S OXNR 
PS LFGM ? L3 V P CT VK TR KKDLY DAVW I Q VS R bAS PLP PQEASNK 
AQDCDDSMGYOYP FTLRWOKDGNS CAWCPWYRFCRGC K I DCGE 
DRAF 1 G?C AY 3 AVDWH PTALHiiRY QTSQ ERWfrEHE SVE Q S RRAQ 
VEPlNbDSCLRAFTSEEEbGENEMYYCSXCKTHCLATKKLDLWR 
LPPIL1 1 HLKRFOFVNGRWI KSQX1VKFPRES FDPSAFLVPRDP 
A3UCQHK P LT PQGD ELS E PR I LAREVKKVDAQS S AGEEDVLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRbRLPQ IGS KNKLSS S KENbDAS KENGAGQ I CE bADALS RGH 
VU3GSOPELVTP0DHEVALANGFLYEHEACGNGCGNGYSNGOUG 
NHSEEDSTDDOREDTRIKPIYNLYAISCHSGIIiGGGHVVTYAKN 
PNCKWY C Y NDS SCKELH PDE I DTDSAY I LFYEQQG I DYAOFLPK 
TDGKKM7J)TSSMDEDFESDY\EKYCVLQ 


5410 


2 


710 


LRFPG0ARHVW1AARM0APHKEHLYKLLVIGDLGVGKTSI I KRY 
VHONFSGrlYRATIGVDFALKVLHWDPETWRLQbWDIAGOERFG 
NMTRVY YKEAMGA FI VFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KPVSVVLLANKCDOGKDVLMNNGLKMDQFCKEHGFVGWFETSAK 
ENIKH5EASRCLVKHI1ANECD1H2SIEPDWKPHLTSTKVASC 
SG\CAJU LVGTFAGVW 


5411 


1302 


285 


TGPAAAGRRKALGSFGKPSPVTGLRAARRPJRTRPSAPAAPSVGC 
G KRRES D AG AGG E RAS VRTG SGRRGG RTMAGDS EQTLQKHQO ? N 
GGEF FL 3 G VSGG TASGKS S VCAK I VQ bLGQNEVDYRQKQ WI LS 
ODSFYRVLTSFOKAKALKGOFNFDHPDAFDNELILKrLKElTEG 
KTVOIPVYDFVSHSRKEETVlVYPADWLFEGItAFYSOBR/IR 
DLFOMKLFVDTDADTRLSRRVLKDISERGRDLEOILSHSTLRFV 
KPA\FEE FCLPPKX KYADVI I PR\GADN\RVPI NLI VCH3 Q\ D I 
1,NGGPS\NR0TNGCLNGYTPSRKRQASESSSRPH 


5412 


3180 


313 


QGISNFFK KEAN FW FEVSGYLI SPLRS PFVDFALEWSbMAS PWN 
XMEGESSRFEIKTPVSDKKKXKCSIHXERPQKHSHEIFRDSSLV 
NEQSQITRRKKRKKDF0HL1SSPLKKSRICDETANATSTLKKRK 
KRRYSALEVDEEAGVTWLVDKENIKNTPKHFRKDVDWCVDMS 
I EQKLPR K V PKTDK r OVLAKSH \AHKS EALHSK VREK KNKKHQR 
KAASWESORA\RDTLPOSEFPTQEESWLSVGPGGEITELP\ASA 
RKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDM0ESOPTV 
GLDDETPOLLGPTHKKKSKKKKKKKSNHQEFBSLAMPEGSOVGS 
EVGADMQES\RPAVGLHGETAGtPAPAYKNKSKKKKKKSNHOEF 
EAVAMPESLESAYPEGSOVGSEVGTVEGSTALKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VKSRPROKKTQACLASKHVOEAPRLEPANEEHNVETAEDSEIRY 
LSADSGDADDSPADLGSAVKQ1.QEFIPNI KDRATST1 KRMYRDD 
LERFKEFKAQGVAIKFGKFSVKENKQLEKNVEDFLALTGIESAD 
KLLYTDR Y P E EKS VI TNbKRRYSFRLK I G \ RNI AR PWXLI YYRA 
XKMFDVNN Y KGR Y S EGDTEKLKMYHSLLGNDW KT3 GE M VARRSL 
SVAIjKFS0ISS0RNRGAWSKSETRKLIKAVEEVIIjKK14SP0ELK 
EVDSKLOENPESCLSrVREKLYKGISWVEVEAKVQTRNVJMQCKS 
Ktf TEILT KR MTNGRR I Y YGKNALRAKVS L I ERbYE I MVS DTNEI 
DWEDLAS A I GDVP PS YVQTKFSRLKAV YVP FWQKKTFPE I IDYL 
YETTLPLLKE KLE KMMEKKGTKIQTPAAPKQVFPFRDI FYYEDD 
SEGGGHR KRKRRPRRHAWFTPVI PVLWEAKAGWI 1 


5413 


3753 


1304 


RFPAGVA PR RAMAKVSKKVSWSGRDRJDDEEAAPIjLRRTARPGGG 
TPLLNGAGPGAAROSPRSALFRVGHMSSVKLDDEIjLEP\DMDPP 
HPFPKElPKNSKLLSLKYESLDYDNSENQuFLEEERRINHTAFR 
TVEI KRWV I CALIG ILTGLVACFIDI WEN LAGLKYRV I KGN1D 
KFTEKGGLS FSLLLWATLNAAFVbVGSVIVAFI EPVAAGSGI PQ 
IKCFLNGVKI PHWRLKTLVI KVSGVI LS WGG LAVG K EG PM 3 H 
SGSVIAAG I SQGRS TS LKRDFKI FEYLRRDTEKRDFVS AGAAAG 
VSAAFGAPVGG VLFSLEEGASFWNQFLTWR I FFASM3 S TFTLNF 
VLSI YHGNM WDLSS PGbl NFGRFDSEKMAYTIHEI PVFI AMGW 
GGVLGAVFNALNYWLTMFRIR YI HRPCL0V1 EAVLVAAVTATVA 
I^IiIYSSRDCOPl^GGSMSYPLOLFCADGEYNSMAAAFFNTPEK 
SWSLFHDPPGSYNPLTLGLFTLVYFFLACWTYGLTVSAGVFIP 
SLLlGAAWGRLrGISLSYOTGAAIWADPGKYALMGAAAOLGGlV 
RMTLSMVIMMEATSNVTYGFPI MLVLMTAKI VGDVF1 EGLYDM 
HIOLQSVPFLHWEAPVTSHSLTAKEVMSTPVTCbRRREKVGVIV 
DVLSDTASNHNGFPWEHADDTQPARLCGLI LRSQLI VLLKHKV 
FVERSiMLGLVQRRLRLKDFRDAVPRFPPlOSlHVSODERECTKD 
LSEFMNPSPYTV PQE AS L PRV PKL»FRALG1»RHLWV7DN RN QWG 
LVTRXDLARYRLGKRGLESLSLAOT 




5414


2130


390 


GVASAWDRALFSPLLSPTSRVFRTSPPRCVSTETGRRDRARVPS 
QWCSVLOGKliPVSGRTSliACVRSILLSPASSPRKVGTVGGTGAR 
AGAAPRDHGRYRHRRPSSARRMTRTTGQCtiAPRGCOGPRGTRSP 
RSPRSRTRRGCSASPACbP/CRSALIVAVLCyJNIiNYMDRFTV 
AGVIiPDIEOFFNIGDSSSGLIOTVFISSYMVLAPVFGyLGDRYN 
RKYLMCGGIAFWSLVTLGSSFIPGEHFWLLLLTRGLVGVGEASY 
STIAPTLIADLFVAD0RSRMLS1FYFAIPVGSGLGYIAGSKVKD 
MAGPV7KMALRVTPGLGWAVLLLFLWREPPRGAVERKSDLPPL 
NPTSWWADLRALARNPSFVjSSLGFTAVAFVTGSLALKAPAFLL 
RSRWLGETPPCLPGDSCSSSDSL1 FG L I TCLTG VLGVGLG VEI 
SRRLRHSrJPRADPLVCATGLLGSAPFLFLSLACARGSlVATYIF 
IFIGE7LLSMNWAIVAD3LLYVV1PTRRSTAEAFQIVLSKLLGD 
AGSPYLIGLISDRLRRNWPPSFLSEFRALQFSLMLCAFVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


IPFKTkLSLQKH\LTTLT\KQEGATIFEEVQKLRPRWEQRENEL 
USFLRCLFEEKQKEHIHIGEMKQTSQMAAENIGSELFPSATRF 
RLDMLKNKAKRSLTESLES ILSRGNKARGLQEHS I S VDIrDSSLS 
STLSNTS KEPS VCE KEALP I SES S F KLLGSSEDLS SDSESHLPE 
E PAP LS PQOAFR RRANTLS H ?P I ECQE ? PQPAR GS PGVSO R KLM 
RYHSVSTETPKE'RKnFESKANHLGDSGGTPVKTRRHSWRQQIFL 
RVATPQ KA CDS S SR Y EDY S ELGELP P R S PLEPV CEDGPFG P PPE 
EKKRTSRELRELWQKAILQQILLLRNEKENOXLQASENDLLNKR 
LKLDYEEITPCLKEVTTVWEKMLSTPGRSKIKFDMEKWHSAVGQ 
GVP\RHHRGEIWKFLAEQFHliKHCFPSKQQPKDVPYKELLKOLT 
SQOHAILIDLGRTFPTHPYFSAQLGAGQLSLYN1LKAYSLLDQS 
VGYCOGLSFVAGILbLHMSEEEAFKMLKFLMFDMGLRKQYRPDM 

I IbOl omyolsrllhdyhrdlynhleeheigpslyaapwfltmf 

ASQFVLG FVAR VFDMI FLQGTE VI FKVALSLLG SHKPLILgHEN 
LET J VDFI KSTL PNLGLV0MEKTINQ VFEMDI AKQLQAYE VE YH 
VL0EEL 3 DSS PLS DNQRKDKLEKTNSSLRKONLDLLEOLQVANG 
RIOSLEATIEKLLSSESKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KS01>F CFWGGKAGDI LSGDODKEOKDPYFVETPYGYOLDLDFLK 
TVPDIOKGNT I KRLNI QKRRKPSVPCP EPRTTSGQQG I WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTP1SKPPPPLETSLPFLT1P 
ENROLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMOMTFGEF 
RRPR LASFGGMG TTSSLPS FVGSGNHNPAKHQLCNGYQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKROLVSOLKNORAA 
SQI NVCGVRKR S YS AGNASOLEQLSRARRSGGELY I DYEEEEME 
TVEOSTQR I KEFRQL\TADMOALEQKIQDSSCEASSELRENGEC 
RSVAVGAEENMNDIWYHRGSRSCKDAAVGTIiVEMRNCGVSVTE 
AMLG VMTEADKE IELOOQT I ESLKEK I YRliEVQLRETTHDREMT 
KLKOELQAAGSRKKVPKATMAQPiiVFSKVVEAVVOTRDOMVGSH 
MDLVDTCVGTSVETNS VG3 S CQPECKNKWGPELPMN WW I VKER 
VEMHDRCAGRSVEMCPKSVSVEVSVCETGSNTEESVNDLTLLKT 
KLHIjKEVRSIGCGPCSVDVTVCSPKECASRGVNTEAVSQVEAAV 
MAVPR7AD0DTSTDLEQVHGFTNTETATLIESCTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
PSAVXTKESGVGQININPNYLVGLKMRTIACGPPOLTVGLTASR 
RSVGVGDDPVGSSLENPQPQAPLGMMTGLDHYIERIQKLLAEOQ 
TLLAE^SELAEAFGEPHSQKGSLNSQLISTLSSINSVMKSAST 
EELRNPDFOKTSLGKITGSYLGYTCKCGGLOSGSPLSSQTSOPE 








OEVGTSEGKPISSLDAFPTOEGTLSPWLTDDQIAAGLyACTriW 
ESTLKS 1 MKXKDGNKDSNGAKKNLQFVG1 NGGYETTSSDDSS SD 
ESSSSESDDECDVIEYPLEKEEEEEDEDTRGMAEGHHAVNIEGI* 
KSA^VEDEMOVOECEPEKVEIRERYELSEKMLSACNLLXNTINP 
PKALTSKDMR FCLNTLOHEWFR VSSQKSA I PAWVGDYIAAFEAI 
S PDVLR Y V I NLABGNGNTALHY SVSHSNFE I V KLLI »DADVCNVD 
HQNKAGY TP 1 MIAALAAVEAEKDMRI VEELrGCGDV^KAKASOAG 
OTALM^VSHGRIDMVKGLLACGADVNIOBDEGSTALMCASEHG 
HVEIVKLLl^OPGCNGHLEDNDGSTALSIALEAGHKDIAVbLYA 
HVNFAXAQSPGTPRLGRKTSPGPTHRGSFD 


5417 




4074 


KS0L>FCFWGG ■CAGDILSGDQDKEQKDPYFVETPYGYQIjDLDFIjK 
YVDDICKGNTIKRbNIQKRRKPSVPCFEPRTTSGCOGrWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPlSKPPPPLETSLPFLTIP 
ENROLPPFSPOLPKHNLHVTKTLMETRRRLEQERATHOMTPGBF 
RRPRLA.S FGGMGTTSS LPSFVGSGNHT^PAKKQLQNGYQGNGDYG 
SYAPAAP TTSSMGSSI RH3PLSSGISTPVTNVS PMH LQH I REQM 
AIALKRLKELEEOVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
S0IWVCGVRKR5YSAGNASCLEOLSRARRSGGELY3DYEEEEME 
TVEOSTOR1KEFRQL\TA3MQALEQKIODSSCEASSELRENGEC 
P. S VAVGAEENMNDI WYHRGSR S CXDAAVG Tl# VEMR N CG VS VTE 
AM LG VWT EADKE I E LQQQTIE S LKE K3 Y R LE VQLRE TTHDREMT 
KLKOE LOAAGS RKKVDKATMAQ PLVFSKVVE AWQT RDQtW GSH 
MDhVDTC VGTS VETNS VG I SCQPECKNKWGPELPMNWW I VKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKE VRS 3 G CGDCS VD VTVCS P KE CAS RGVNTEA VS Q VE AAV 
MAVPRTADQDTS TDLEQVHQFTNTETATLI ESCTNTCLSTLDKQ 
TSTOTVETRTVAVGEGRVKDIKS S TKTRS I GVGTIiSGHSGFDR 
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQI.TVGLTASR 
RS VGVGDDPVGES LENPQPQAPLGMMTGLDH Y I ER1 QKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQL1STUSSINSVMKSAST 
EELRWPDFOKTSLGKITGSYLGYTCKCGGLOSGSPLSSQTSQPE 
OBVGTSEGKPISSLDAFPTQEGTLSPVNLTDDOIAAGLYACTNN 
ESTLKS 3 MKKKDGNKDSNGAKKNLOFVG INGGYETTS SDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDE«QVOECEPEKVEIRERYEI,SE^iLSACNLI>KNTIND 
PKALTSKDMRFCLNTLOHEWFRVSSQKSAI PAMVGDYIAAFEAI 
S PDV LR YV I N LADGNGN T ALHY SVSHSNFE3V KLLLDADVCNVD 
HONKAG YTF J MLAALAAVEAEKDMRI VEELFGCGDVNA KASQAG 
0TALMLAVSHGRIDMVKGLLACGADVNIODDEGSTALMCASEHG 
HVE I VKLLLAQPGCNGHLEDNDGSTALS I ALEAGHKDIAVLLYA 
HVNFAXAQS PGTPRLGR KTS PGP THRG SFD 


5418 


24 


1133 


SVPRAGGDKETGAAEIjY DQALLG I LOH VGNVODFLRVLFGFLYR 

ktdfyrllrhpsdrmgfppgaaoalvlqvfktfdhhiarqddekr 
rqkleekirrkeeeeaktvsaaaaekepvpvpvqeieidsttel 
dgkoevekvqppgpvkemahgsoeaeapgavagaaevpr\ep?i 
lpriqeofqknpdsyngavrenytwsodytdlevrvpvpkhwk 
gkqvs valssss i r vamleengervlm3gklthkintesslwsl 
e pgkcvlvnls kvge ykwnai legeep i didkinkersmatvde 
eeqavldrltfdyhqklqgkpqshelkvheklkkgwdaegspfr 

GQRFDPAMFNISPGAVQF 


5419 


1251 


259 


gthpldpdlvsrtsvg^plmtmacpgmsdteespflgpraaeeg 
seseaceafgrrkseeegrrsdtsgfgrsrkhkvnwkhperada 
kdpaslpqc/lgp/lx^rpaqpsskycsddcgmklaanriyeiii 
por 3 q0wq0s p ci asehg kkdler i r reqqsartrlqemerrfh 
eleaiilrakqoavredeesnegdsddtdloifcvscghpinpr 

VALRHMER CYA KYESQTS FGS MYPTR 3 EGATRLFCDVYNPQSKT 
Y CKRLOV LCPEHSRDPKVPADEVCGC PLVRDVFELTGDFCRLPK 
RQ(^RHYCWEKLRRAE\mLERVRVwyKLDELFEQERNVRTA>lTN 
RAGI.IALMLHQT1 QHDPLTTDLRSSABR 


5420 


111 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFL-PARPPPJ^ORHLGFSHAEQSMEAPDYEVLSVREOIjFHERIR 








EC1 1 STLLFATLYILCHIFLTRFKKPAEFTT\GM«KMPPSTRL/ 
LLELCTFTLA: ALGAV LLLPFS 1 1 SKEVLLS LPRNYY I QWLNGS 
LIHGLWNLVFLFSNLSLIFLMPFAYFFTESEGFAGSRKGVLGRV 
YE7VVMU^LIiTLL\^MVWV7iSA3VDKNXANRESbyDn«EyyiiP 
YLY S C I SFLGVLLLLVCTPLGLARMFS VTGKLLVKPRLLEDLEE 
OLYCSAFEEAALTRRICNPTSCWLPLDMELLHROVI.ALQTQ^VL 
LEKRRKASAW0RNLGYPIAMLCLLVLTGLSVLIVAIH1LELLID 
EAAM PRGMQGTSLGQVS FSXLGSFGAVI QWLI F YLMVSS WGF 
YSSPLFRSLRPRWHDTAMTOIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGR FN WLGN FY I VFLYNAAFAGLTTLCLVKTFTAAV 
RAELiI RAFGERE 


5421 


117 




NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVViVLGG 
GGFLPARPPRAQRHLGFSHAEOSMEAPDYEVLSVREOLFHERIR 
EC1 3 STLbFATLYI LCH 3 FLTR F KKP AEFTT\GMM KM P P S TRb/ 
LLELCTFTLAI ALGAVLLbPFS 1 3 SNEVLLSLPRNYYIQWLNGS 
LIHGLWNLVFLFSNbSL3FLMPFAYFFTESEGFAGSRKGVLGRV 
YETWMbMbbTLbVbGKVWVASAIVDKNKANRESLYDFWEYYLP 
YLYS C 1 SFbGVbbbbVCTPLGbARMFSVTGKLLVKPRbLEDbEE 
QLYCSAFEEAALTRRICNPTSCWbPLDMELbHRQVbALQTQRVIi 
1»EKR RKAS AMQRWLG Y PLAMLClibVL-TGLiS VLI VAI H 1 LEbL I D 
EAAMPRGMQGTS LGQVS FSKLGS FGAV I QWL1 FYbMVSS WGF 
YSSPLFRSLRPRWHDTAMTQI I GNCVCbbVLSSALPVFSRTLGL 
TRFDLLGDFGR FN WLGN FYIV FbYNAAFAGLTTLCbVKTFT AAV 
RAELI RAFGERE 


5422 


3 


1263 


SCGESbPTWbAGASRPGlGRKGGAWGGRGGSSPAOVbbSPGPVF 
KAGCMWWHLSRDOAGVORCDLGSSOPPPLGFKRFSCLSLPSSWD 
YRST^CVSKMEADLSGFNIDAPRWDORTFLGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKSRMGWPPGTQVEQJ>T»YAKKLYDSAF 
HPDTGEKMNV3 GRMSFQLPGGMI ITGFMLQFYRTMPAVI FWQWV 
NQSFKALVNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 
LTKKA P PLVGRVJ V P F AAV AAAN CVN 3 PttMRCOELI KGICVKDRN 
ENE3 G HSRRAAA I G I TQWISR I TMS APGMI LL PV I MERbEKLH 
FMQKV K Vb/ SAP LQVMbSGCF b I FN V P V ACGLF PQKC ELP VS YL 
EPKLODTI KAKYGELEPYVYFNKGL 


5423


3186 


905 


GVSMALGEEKAEAEASEDTKAOSYGRGSCRERELDI PGPMSGEQ 
PPRL-EAEGGLIS PVWGAEG I PAPTCW3 GTDPGGPSRAHQPQASD 
AMREPVAERSEPALSGbPPATMGSGDLLbSGESQVEKTKLSSSE 
EFPCTLSbPRTTlCSGHDADTEDDPSbADLPQALDLSQQPHSSG 
LSCL S OWKSVLS PGSAAGPSSCS 3 SASSTGSSLQGHQERAEPRG 
GSLAKVS SSbEP WPQEPS S WGLGFR PQMSPQPVFSGGDASGb 
GRRRLSFQAEYWACVLPDSLPPSPDRHSPbWNPNKEYEDLLDyT 
Y PbR PG PQLP KKLDS R VFADP VbODSG VDLDS FS VS P ASTLKS P 
TKVSFNCPPAFA'J'AbPFSGPREPSbKQWPSRVPQKOGGMGbASW 
SOLASTPRAPGSRDARWERREPALRGAKDRbTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSOSARRPTCTESRWKSEEEVESDDEY 
bALFARbTOVSSLVSYLGS 3 STbVTbPTGDI KGOS PLEVSDSDG 
PASFFSSSSQS0LPPGAA1QGSGDPEGQNPCFLRSFVRAHDSAG 
EGSbGSSOAbGVSSGLLKTRPSbPARLDRWPFSDPDVEGQLPRK 
GGECGKESbV0C\VKTFC\C0LEELICV71iYNV\ADVTDHGTPAR 
SN LTS LK \ SSLQL Y RO? KKD 3 DEHQS LTES VLQKG E 3 LLQCLLE 
NTPVLEDVbGRIAKQSGEbESHADRbYDS I LASLDMLAGCTLI P 
DKKPKAAMEH PCEGV 


5424 


3186 


905 


GVS^XGEEKAEAEASEDTKAOSYGRGSCRERELDIFGPMSGEQ 
PPRbEAEGGbISFVWGAEGIPAPTCW3GTDPGGPSRAKQPQASD 
AWREPVAERSEPAbSGbPPATMGSGDLbLSGESOVEKTKbSSSE 
EFPOTbSbPRTT3CSGKDADTEDDPSlADL?0ALDLSO0PHSSG 
LSCbSOWKSVLSPGSAAQPSSCSlSASSTGSSbOGHQERAEPRG 
GSLAKVSSSbEPVVPOEPSSVVGLGPRPOWSPOPVFSGGDASGb 
GRRRL2FQAEYWACVLPDSLPPSPDRHSPLWNPNKEYEDLLDYT 
YPLRPG PQLPKHLDS R V F ADP VLQDSGVDLDS FS V S PASTLKS P 
TNVS PN CP P AEAT AbP FSGPREP SbKQW P SRV PQKQGGMG LAS W 
BNSDOCID: <WO 0153312A1>






sqlastprapgsrdarwerrepalrgakdrltigkkldmgspqi, 
rtrdrgwpsprperekrtsosarrptctesrwkseeevesddey 
lalparltqvss lvs ylgsi s tlvtlptgdi kgqs plevsdsdg 
pasfpssssqsqlppgaal0gsgdp2g0npcflrsfvrahdsag 
egslgssoalgvssgllktrpslparldrwpfsdpdvegqbprk 
ggeqgkeslvocWktfcVcoleelicwlynvXadvtdhgtpar 

SNLTSLKNSSLQLVRQFKXDIDEHOSLTESVLQKGEILLQCLLE 
NTPVLEDVLGRIAKQSGELESKADRLYDSILASLDl^LAGCTLIP 
DKKPMAAKEHPCEGV 


5425 



ioefc 


115 


GFCPS P SLGHQ P PR VLHPTMSMA V ETFGF FMAT VGLLMLG VTL.P 
NS Y WR VSTVHGNVITTNT I FENLW FS CATDS LGVYNCWE FPSML 
ALSGYIOACRALMITAILLGFLGLLLGIAGLRCTNIGGLELSRK 
AKLAATAGAPH \ I LPG 1 CGMVA I \ S W YA FN I TR \ DFS DPLYPGT 
KYELGPALYLGWSASL3 S ILGGLCLCSACCCGSDEDPAAS ARRP 
YQAP VS VM PVATS DQEGDS 3 FG KY GRN ALRVAAL»CRGPRCI*PTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLI F 


5426 



42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDOPSAPSDPTDQP 
PAAHAKPDPGSGGOPAGPGAAGE AIiAVLTS FGRRLLVLI PVYLA 
GAVGLSVGFVLFGIiAbYliGWRRVRDEKERSbRAAROLLDDEEQL 
TAKTLYMSKRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYKEK 
LLAETVAPAVRGSNPHLQTFTFTR VELGEKPLR 1 IGVKVHPGQR 
KEQI LLDLNI SYVGDVQIDVEVKKY FCKAGVKGMQLHGVLRVIL 
EPL IGDLP FVGAVSMFF I RR PTLD 1 NWTGMTNLLDI PGLSSIiSD 
TMI MCS 1 AAFLVLPNRLLV? LVPDLQDVAOLRS PL PRGI I R I HL 
LAARGLSS KD KY VKGLI EG K SD P Y Al> VRLGTQTF C S R V 1 DEELN 
POWGETYEVMVHEVPGQEIEVEVFDK0PDKDDFLGRMKLDVGKV 
LOAS VLDD WF PLQGGQGQ VH I iR L E W hSLLSDAEKLEQVLQWNWG 
VS SRPDPPSAA1 LVWLDRAODLPMVTSELY P PQLKKGNKEPNP 
MVQLSIODVTQESKAVYSTNCPVWEEAFRFFLQPPQSQELDVQV 
KDDS RALTLG ALTLPLARLLTAP EL X LDQ WFQ LS S3G P NS RL YM 
KLVMRI LYLDSSEI CFPTVPGCPGAVJDVDSEWPQRGSS VDAPPR 
PCHTTPDSOFGTEHVXiRIHVLEAODLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNFRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPR PTAAELEEVbQVH SXil QTOKS AEIAAALLS I YMERAED 
LPLRKGTKHLSPYATL.TVGDSSHKTKTISQTSAPWIDESASFLI 
RKPHTES LELQ VRG EGTGVLG S LS LPLSELLVADQLCLDRWFTL 
SSGQGQVLLRAQLGILVSOH£GVEAHSHSYSHSSSSLSEEPELS 
GGPPHITSSAPEV\RORLTHVDSFLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDFYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVOLBLAETDliS QGVARW Y DTjMDN KDKGS S 


5427

1 


a;. 


3435 


ATSSQS LG RADPPRGG1MER S PGEG PS P S PMDQP S A? SDPTI)QP 
PAAHAKPDPGSGGQPAGPGAAGEALAVTLTSFGRRLLVLIPVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL 
TAKTLY MSHREL PAW VSFPDVE KAE WLNKI VAQVWP FLGQYMEX 
LLAETVAPAVRGSNPaLQTFTFTRVELGSKPLRIIGVKVHPGQR 
KEQI LLDLNI S YVGDVQIDVEVKKY FCKAGVXGMQLHGVLR VI L 
EPLIGDLPFVGAVSMFFIRRPTLDI NWTGMTNLLDI PGLSSLSD 
TMIMDS I AAFLVLPNRLLVPLVPDLODVAOLRSPLPRGI IRIH1* 
LAARGLS SKDK YVKGLIEGKSDPY ALWLGTQTF C SR VI CEELN 
POWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFLGRMiCLDVGKV 
LOASVLDDWFPLOGGQGQVHLRLEWLSLLSDAEKLEQVLQWmJG 
VSSRPDPPSAAILWYLDRAQPLPKVTSELYPPOLKXGNKEPNP 
MVQLS 3 QDVTQES KAVYSTNCP VWEEAFRFFLQDPOSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDOWFQLSSSGPNSRLYM 
KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTEHVLRIHVLEAODLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEV1VTSVPGQELEVEVF 
PKDLDKDDFLGRCK\nUjTI^LiKSGFl(DEWLTLEDVPSGRLHIiRIj 








ERLTPRPTAA2LEEVL0VNSLIQTQKSAELAAALL5 1 YMSRAED 
LPL^KGTKHLSPYATIjTVGDSSHKTKTISOTSAPVV.'DESASFIjI 

kkphteslelovrgegtgvlgslsi*plsellvajx:lcldrwftl 
ssgqgqvllraqlgilvsqhsgveahshsyshsssslseepels 

GGPPHITSSAPEVVRORbTHVDSPLEAPAGPLGOVKLTLWYYSE 
ERKLVSIVHGCRSLRQ.MGRDFPDPYVSLLLLPDKJ'JKGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNESFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 


3 


1839 


SS R S SRLSACAI APPWLVS SKPAR PAOLQRPGJOMVEDGAEEliED 
U^HFSVSELPSRGYGVMEEIRROGKLCDVTliKIGDHKFSAHRIV 
LAAS I PYFHAMFTNDMMECKQDE 1 VM0GMDPSALEAL3 N FAYNG 
KLAIDQQNVQSljLMGhSFLOhQS I KDACCTFLRERLKPKNCLGV 
RQFAETMMCAVLYDAANSFIH0HFVEVSMSEEFLA1FLEDVLEL 
VSRDELNVKSEE0VFEAAIAV , 7VRYDRE0RGTPli\RNLQSNIRLL 
FCRPQFLSDRVQODDLVRCCHKCRDLVDEAKDYLL^FERRPHLP 
AFRTRPRCCTS I AGLIYAVGGLNSAGDSLNWEVFDF 1 ANCWER 

CXi V MTVk T3 Ctj VflVMATNCn A .Vlil CXI Y TV^OT .UT .CTiJ OtiVNT T PIT 1 
V-.rv.rrj A \ r\l\ D-Tv Vu VnV V Vj Xj U 1 rt JL W 1 IAJVUiUm 1 V ^./-i I H J fi 1 U 1 

WTKVGSMNSKRSAMGTVYLDGQI YVCGGYDGNSSIjSSVETYSPE 
TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YKHHTATWHPAAGMLNKRCRHGAASLGSKMFVCGGyDGSGFLSI 
AEMYSSV\ ADQWCLI VPM\ HTRR \ SRVSLGGPAVGRLYAWJG VT 

I 


5429 


826 


202 


RRE DALS SEG CLW PSES TV S GNG1 P E PQ VYAPPRP 7DR LAVP P F 
AORERFHRFQPTYPYLOHEIDLPPTISIiSDGEEPFpyQGPCTIiQ 
LRDPEQQLELNRESVRAPPKRT1FDSDI.MDSARLGGPCFPSSNS 
G1SATCYGSGGRMEGPPP\TYSEV3GHYPGSSFQH00SSGPPSL 
LEGTRLHHTH3 APLESAA3 V)S KE KDXQKGHPL 


5430 


441 


1507 


QKRRKRRRKKlMKTIQP^iHNSlSWAIFTGLAALCLFQGVPVRS 
GDATFPKAMDNVTVRQGESATLRCT IDI^RVTRVAWINRST 3 LYA 
c Km vw n cc\r\n t c\7T , /tt , ov c t it T/^iAn/rn/vncv^ dytpci/athh 

IjlML'KWCLi.UJrK V VijJjojv I y I yi$lC*±\JjXVUVjUtL\3t' i J Li V \Ji UN 

hpk7s rvhli vqvs pki ve1 s sdi s i negnnisltc i atgrpep 
tvtwrh i s pkavgfvsede yle1 qg i treqsodye c s asndv\ a 
apvWrrvkvtvnyppyiseakgtgvpvgqkgtloceasavpsa 
efowykddkrli/egkkgvkvenrpflskliffnvsehdygnyt 
cvasnklghtnasimiifgpgavsevsngtsrragcvwllpllvl 

HLLLKF 


5431 


2 


1312


AA^PGSRRRRPLPDRPHMAKGYEAPPPPAPRSPAWKARSKPVV 
LPG1TINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQ0 
KKPJL.EAFLTQKAKVGELKDDDFER1SELGAGNGGWTKV0HRPS 
RLT M7\ RKI.THI.FTTfPAT R?\JO 7 T R FT /WI .H FCN*? P Y T VG F YG AF Y 
SDGEISICMEHKDGGSLDQVLKEAKRIPEEIIiGKVSIAVLRGLA 
YLREKHQ I r^RDVKPSN 1LVNS RGEI KLCDFGVSG0L1 DSMANS 
FVGTRS YKAPERLQGTHYSVQSDI WSKGhShVELP. VCR YPIPPP 
DAKELEAI FGRPWDGEEGEPH S I SPRPRPPGRPVSGKGMDSRP 
Af^AlFELLDYlWEPPPKLPNGVFTPDFQEFVNKCblKNPAERA 
DLKMLTJWTFIKRSEVEEVDFAGWLCKTl/RriNQPGTPTRTAV 


5432 


2 


1312 


AAAAPGSRRRRPLVDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPGITIl^P\TIAEGPSP\TSBGASEANLVDliQKKLEEI-ELDEQQ 
K KR L EA FLTQKAK VGEL KDDDFER I S ELGAGNGGWTKVQHRP S 
GLI MARKLI HLEI KPAIRNQ1 IRELOVJbHECNSPYJ VGFYGAFY ! 
SDGE I SI CMEHMDGGSLDQVLKEAKRI PEEILGKVS 1 AVbRGLA 
YLREKHQIMHRDVKPSNILWSRGEI KLCDFGVSGC'LI DSMANS 
FVGTRSYMAPERLQGTHYSVOSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAIFGRPWDGEEGEFHSISPRPRPPGRPV£GHGMDSRP 
AMAI FELLDYIVNEPPPKLPNGVFTPDFQEFVNKCI j knpaera 
DLW^LTNKTFIXRSEVEEVDFAGWLCKTLRLNQPGTFTRTAV 


5433 


360 


1885 


SVQEDKVGFEDPUiLCSWRARACPCTWPHC/CTGLLECLGFAGV 
LFGKPSLVFVFKNEDYFKDLCGPDAGPIGKATGQADCKAQDERF ; 
SLI FTLGSFMNNFMTFPTGY3 FDRFXTTVARMAI FF YTTATLI j 
PCT/US00/34263









IAFTSAGSAVLLFLAKPMLT1GGIL.FL1TNLQIGNLFG0HRST3 \ 
ITLYNGAFDSSSAVFLI IKLLyEKGISLR/VLLHLHLCLQYLAC ' 
STHFP PDAPG AH PIP TAP QLOL W P VP WEWHH KGK EG /QQhS MKT 
GSYSQRSSFQRRKRPQGQGRSRNSAPSGATL/CSRRFAWHLVWL 
SV I QLKHYLr 1GTLNSLLTNMAGGDKARVSTYTNAFAFTQFGVL 
CAPWNGLLMDRLKQKYOKEARKTGSSTLAVALCSTVPSLALTSL 
LCIjGFALCASVPILPLQY LTF I LQV I SRSFLYGSN AAFLTIAFP 
SEHFGKLFGLVNIALSAWSLLQFP1FTL3KGSLQNDPFYVNVMF 
MLAI LLTFFHPFLVYRECRTWKES PS A1A ; 


5434


66


652 


RyAALIISLIQHKLLWRNOHCSRCVlMSPAQSAGLNWLF/GSGK ( 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV : 
LRQGRFGfoFIGCINYPECEHTELIDKPDETAITCPQCRTGHLVQ . 
RRS R YGKT FHSCDR Y P ECQF Al NFKPIAGECP ECHY PLLl EKKT 
A0GVKHFCASKQCGKPV5AF 1 


5435


4704 


1597 


PGDSSQRIAEMSNAKERKHAKW4RHQPTNVTLSSGFVADRGVKH 
HSGGEKPFQAOKOEPHPGTSRQROTRVNPHSLPDPEVNEQSSSK 
GMFR KKGG WKAG PEGTSQEI P K Y 1 TAS TFAQARAAE I S AMLKAV 
T0KSSNSLVF0TLPRHMRRRAKSHNVKRLPRRLOEIAOKEAEKA 
VH0KXEHSKNKCHKARRCHKNRTLEFNRR0KKNIWLETHIW1IAK 
RPHMVKKWGYCLGERPTVKSHRACYRAMTNRCLLODLSYYCCIjE 
LKGKEEE I LKALSGMCNI DTGLTFAAVHCLSGKRQGSLVLYRVN 
KYPREMLGP VTF I WKSQRTPGDPSES RQLWI WLHPTLKQD ILEE 
IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGENAKPI KKI IGDGTRDPCXPY S W I SPTTG III SDLTMEMN 
RFRLIGPLSHSILTEAIKAASVHTVGEIJTEETPHRWWIETCK.XP 
DSVSLHCROEAIFELLGGITSPAEJPAGTILGLTVGDPRINLPQ 
KKSKALPNPEKCODNEKVRQLLLEGVPVECTHSFIWMQDICKSV 
TENK I SDQDLNRMRSELLVPGSQLl LGPHESKI P I hhJ QQPGKV 
TGEDRLGWGSGWDVLLPKGWGMAFWIPFIYRGVRVGGLKESAVH 
SQ Y KR S PNVPGD FPDC PAGMLFAEE OA KNLLE K Y KRRPPAKR PN 
YVKLGTIjAPFCCPWEOLTODWESRVOAYEEPSVASSPNGKESDL 
RRSEVPCAPMP KKTHQ PS DE VGTS 1 E H PRE AEE VMDAG CQESAG 
PERI TDQEAS EN HVAATGSH LCVLRSRKLLKQLS AMCGPS SEDS 
RGGR RAPGRGQQGLTREACLS I LGKFPRALVWVSLSbLSKGSPE 
PHTM 2 CVPAKED FLQLHEDWH Y CGPQE S KHSDP FR S K I LKQKEK 
KKREKRQKP\GRASSDGPAGEEPVAG0EALTLG1.VJSGPLPRVTL 
HCSRTLLGFVT0GDFSMAVGCGEALGFVSLTGLLDM1.SSOPAAQ 
RGLVLLRPPASLQYRFARIAIEV 


5436


1781 


635 


ASDS 1 PWSEARTTRKLAORGCOWSLPERMPliWFCGLPYSGKSR 
RAEELR VALAAEGRAVYWTDAAVLGAEDPAVYGDSAREKALRG 
ALRAS VERRLS RHD WILDS LNY IKGFR YEL Y\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWNGSAOADVPKELERE5SGAAESPALVTPD 
SEKSAKHGSGAFYSPELLEAL.TLUFEAPDSRNRWDRPL.FTLVGL 
EEPLPLAG3RSALFENRAPPPHOSTCSOPLASGSFLHQLDQVTS 
OVLAGLMEAOKSAVPGDliLTLPGTTEHLRFTRPLTr^AELSRLRR 
QFISYTKMHPNNENI.POLANMFLQYLSQSLH 


5437 


739 


1672 


C0EAASEFGGPLH7PAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
P S KEE PQVEQLGS KRMDSLKW DQ P IS STQESGRLEAGG AS PKLR 
WDKVD SGGTRR PG VS PEGGL \G V PGPG APLEKPGRREKLLGWIiR 
GEPGAPSRYLGGP EECLOI STNLTLKLLELLASALLALCSRPLR 
AAIX)TLGLRGPLGLWLHGLLSFLAALKGLHAVLSLLTAHPLHFA 
CLFGLLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 


5438 


2443 


1152


TKPRKRRHQPASQRORPWSSDSTGDLLARGKGRKEENKGSDRVS 
IAPPSLRRPMMCOSEAJIQGPELRAAKWLHFPQLALRRRLGOLSC 
MSRPAIiKLRSWPLTVLYYLLPFGALRFLSRVGWRPVSRVALYKS 
V PTRbLS RAWGRLNQ VELPH W LRRP VY S LY I WT FGVNMKE AAVE 
DLHHYRKLSEFFRRKLKPOAKPVCGLHSVISPSDGRILNFGQVK 
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQIiVT 






REGNEbYKCVlVlJi.PGDYHCFHSPTDWTVSKRRHFPGSLMSVNP 
GMAR MI K£LFCHKEK WLTGDWKHGFPSLTAVGAT\ NWGS XR1 Y 
FDRDLHTKSPRH5 KGSYNDFSFVTHTNREGVPMALRGEHLG/OS 
FNLGSTIVLlFErtPKDFNFOLKTGQKIRFGEAliGSl. 


5439


2443 


1152 


TKPRKRRHQPASCRCRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCCSEAROGPELRAAKWLHFPCLAIiRRRLGOL.SC 
MSKPALKLRS WPLTVLYYLIjP FGALRPLS R VGWR P V SRVAL YK5 
VPTRLLSRAWGKlNCVELPHWLRRPVySLyiV'TFGVNMKEAAVE 
DLHHYRNLSEFFKRKLKPOARPVCGLHSVISPSDGRILNPGOVK 
NCEVEOVKGVTYSLESFLGPRMCTEDLPFPF/'ASCDSFKNQLVT 
REGNELYliCVIYLfvPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
G{*ARW i KELFCHK VVLTGDWKHGFFSLTAVG AT \KWGS IRtY 
FDRDLKTNSPRHS KGSYNDFS FVTHTMREGVFKALRGEHLG / QS 
FNLG STI VTLI FEA? KDFNFQLKTGQKIRFGFJ-XG Sl> 


5440


692 




EFiPVTPDHRLVTKTKIV\OTFSPVNS\G0PFNVEMLKEEOEVA 
MLGAP}INPAPPr<?TVlHIRSETSVPDHWWSLFNTLFMNTCCLG 
FIAFAYSVKSRDRXr':VGDVTGAQAVASTAKCLNl WALI LG3 FMT 
ILL3 1 1 PVLWQAQk 


5441


2 




CRDGGJCWGFMVSFMKPLE3 XTQCSGPRMDPK3 CPADPAFFS FIN 
NSDLVr7ANIETGi".ERRLTFCH0GLSNVLDDPKSAGVATFVI0EE 
FDR FTG Y WWCPTAS WEGSEGLKTLRI LYEEVDESEVEV I HVPS P 
ALE E R KTDS YR Y F R TGS KN PK I ALKLAE FOXDSOGK I VS TOE KE 
LV0PFSSLFPK\'EVIARAGWTRDGKYAV/AJ1FLDRP00WL3LVLL 
PPALFJPSTENEEC\KLASARAVPRNVQPYWYEEVTWVWIN\^H 
0IFYPFPOSEGELELCFLRANECKTGFCHLYKVTAVLKSOGYDW 
SEPFSPGEGEOSLTNAIWVWEETKLVYFQGTKDTPLEHHLYWS 
YEAAGEIVRLTTPGFSHSCSKSONFDMFVSKYSSVSTPPCVHVY 
KL^GPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVL FVYGGPOVQLVNNS FKG I K YLRLNTLASLG Y 
AVWI DGRGSCOI'-GLRFEGALKNOMGCVEI EDQVEG hQ FVAE K Y 
GFIDLSRVAIHGV^SYGGFLSLMGLIHKPOVFKVAIAGAPVTWM 
AYDTGVTERYMDVPENNQHGYEAGSVALHVEKLPNEPNRLLILK 
GFLDEMVHFFHTIn FLVSOL3 RAGKPYCLOVALPPVSPOI YPNER 
HS3RCPESGEHYEYTLLKFL0EYL 


5442 

j 

1 
l 


1 


3474


CGQRS RR RS PDMP E A K ? AAKKAP KG KDAPKGA ? KEA P P KE A PAE 
APKEAPPEDQSF1AEEPTGVFLKKPDSV3YETGKDAVWAKVr>5G 
KELPDKFTIKWFKGKWLELGSKSGARFSFKESH>NSAS!rVYTVEL 
HIGKWLGDRGYVKLEVKAKDTCDSCGFKIDVEAPRQDASGOSL 
ESFKRTSEKKSDTAGELDFSGLLKKREWEEF.KKKKKKDDDDLG 
IPPEIKELLKGAKKSEYEKIAFQYGITDLRG^LKRLKKAXVEVK 
KS AAFTKJGLDPAYOVDRGNK 1 K1MV31 SDPDLTLKWFKNGQEI K 
PSSKYVFENVGKKP-aLTINKCTLADDAAYEVAVKDEKCFTELFV 
KEPPVLI VTPLEDOQVFVGDRVEMAVEVSEEGAOVMWMKDGVEL ' 
TREDSF KARYRFK KDGXRHIliI FSDWQEDRGRYQV1TNGGQCE 
AELI VEEK0LEVL0B3 ADLrVKASEQAVFKCEVSDEKVTGKWYK 
NGVEVRPSKRIT3 SHVGRFHKLVIDPVRPEDEGDYTFVPDGYAL 
GSLSAKLKFLEIKVEYVPK0\EPPKIPLGFASGGKTSENAD/1V 
WAGN K LR LDV\ S J TG E APS PFAT\ WLKG \ D E V FTTTEG R ?R I E 
KRVDCSSFVIESACREDEGRYTIKVTNPIGEDVASIFLOWDVP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSO 
RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSOPSMN j 
TKPFMPIAPTSEPLHLIVEOVTDTTTTLKWRPPNRIGAGGIDGy 
LVEYCLEGS EEWVP ANTE PVERCGFTVKNLPTGAR I LFR WGVN 
IAGRSEPATLAOP^IREIAEPPKIRLPRHLROTYIRKVGEOLN 

T \rv/DimCVDP VfW7\ r \vTVfZr , 'b. DT nTCDX/UVDTCnmTX/FFVROIVA 
JjV Vrf \}y:l\r J\r \J V v »% J rtAayjnlr LiLt J cjitv nvKiSUf u l vrr vny/irt 

RSDSGEYELSV0IEN 1 MKDTATIRIRWEKAGPP3NVMVKEVWGT 
NALVE WQAP KDDGN'S E I MGY FVQKADKKTflEW FNV YERNRHTSC 
TVEDLIVGNEYYFRVYTENICGLSDSPGVSKNTARILKTGITFK 
PFEYKEHDFRMAPKFLTPLIDRVWAGYSAALNCAVRGKPKPKV 
WfMKNKME 3 REDPK FL3 TNYQG VLTLK I RR PS F FDAG TYTCRAV 
NELGEALAECKLEVRV PQ 


5443 


66 


1003 


SRGOIiDAGOSSEOKGGNROPEQSRSRSSSSSSSPKKSRSAAEPA 
MAbSMFLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKGWFSVTTVDLKRKPADLQNLAFGTHPPF37FNSEVKTDV 
NKIEEFLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAYIKNS 
RPEANEALERGLLKTLQKLDEYLNSPLPDEIDENSKEDIKFSTR 
KFLDGNEMTLADCNLLPKliKl VKWAKKYRNFDI F KEMTG I KRY 
LTNAYSRDEFTNTCPSDKSVEI\AYSDVAKRLHOVXSRLLKEVS 
FMSSP 


5444 


2 


344 


5 G p I GVTGAQMAKWLRDYLSFGGKR PP PQP PTPDYTBSDl LRAY 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGS?KHRL 
1 KVE AADMAJ^KALLGGPGEELE ADTEY bDPFDAQr HPAF PDDG 
YMEPYDAOWVMSEL-PGRGVQLYDTPYEEQDPETADGPPSGOKPR 
0SRMPOEDERPADEYDQPWEWKKDHISRAFAVOFDSPEWERTPG 
SAKELRRPPPRSPQPAERVDPALFLEKQPWFHGPbNRADAF.Sbb 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENOW 
LGQHS G P FPS VPEL VLKYS SR PL P VQG AEHLALL Y P WTQTF * Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHKERH?EGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRC0VWFSOAFAH 
OGGGCGYGOSCGPSGRPRGGAGSRK 


5445 


2364 


486 


J LSRGFLGS VEIC10LPLPASEPVLLLTWARRRWRETRSRREFT 
TLRAQS VCPWWI * ETRMNRS IPVEVDESEPYPSQLLKPI PEYS P 
EEESEPPAPNIRNMAPNSLSAPTMLHKSSGDFSOAHSTLKbANH 
CRPVSROVTCLRTOVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHOFSFMEKRNQWbVSQLSAASPDTGHDSDKSD 
CSbPNASADSLGGSOEMVORPQPHRNRAGLDbPTICTGYDSOPQ 
DVLGlRQLERPbPbTSVCYPQDbPRPLKSREFPQFEPQRYPACA 
CMbPPNbSPHAPWNYHYHC?GSPDHQVPYGHDY?RAAYQOV10P 
AbPGQPbPGASVRGLHPVOXVlbNyPSPWDOEERFAORjDCSFFG 
LPRHQDOFHHOPPNRAGAPGESbECPAELRPQVPOPPSPAAVPR 
FPSNPPARGTLKTSNLPEELRKVFITYSKDTAMEWKFVNFLLV 
NGFQTA I DI FEDR IRGI DI I KWMER YLRDKTVWI I VAIS PKYKQ 
CVEGAESCbDEDEHGLHTXYIHRMMQlEFlKQGSMKFRFIPVLF 
PNAKKEHVPTWLONTHVTSWPKNKKNILLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SS WS WCTGRMRKTRJjWGIiliWKLFVSELRAATKbTE E K YELK EGQ 
TLDVKCDYTLEKFASSQKAWQIIRDGEMPKTLACTERPSKNSKP 
V0VGR I I LEDYHDHGLLRVRMVNL0VEDSGLYQCV1 YQPPKEPH 
yjLFDRIRLWTKGPSGTPGSNENSTQNVYKlPPTTTKALCFLYT 
TPRTVTOAPPKSTADVSTPDSEINLTNVTDJ 1RVPVFN2 VI LLA 
GC-FLSKSbVFSVLFAVTLRSFVP*AHEPTRWSSDFOPHPSGSCA 
KGGGRR 


5447 



207 


617 


KTARTLS LMAS LVAYDDSD5 SAETEHAGSFNATG00KDTSG VAR 
P PG0E> FASGTLDV PKAGA0 PTKHG S CEDPGG YRLPLAQLGR S DR 
GSCPSQRLOWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKOVKLSRNFPKSSFKAQSESETVGKNGSSFQKKKCEDCWPY 
TFRRLROROALSTETGKGKDVEPCGPPAGRAPAPLYVGPGVSEF 
ICPYbNSHYKETTVPRKVLFHLRGKRGPVNTJOWCPVLSKSKWL 
L*ST S MD KT F KVWNAVD SGHCbQTYS LHT E AVRAARK AP CGRR I L 
SGGFDFALHLTDLETGTQLFSGRSDFR I TTLKFHPXDHN3 FLCG 
GFSSEMKAWDIRTGKVMRSYKATIOOTbDILFLREGSEFLSSTD 
ASTRDSADRT1 1 AWDFRTSAKI SNQIFHERFTCPS1ALKPREPV 
FLAOTNGNYLAbFSTVWPYRMSRJRRRYEGHKVEGYSVGCECSPG 
GDLLVTGSADGRVLMYSFRTASRACTLQGHTQACVGTTYHPVbP 
SVLATCSWGGDMKlWH*AFHWLSLGEAIGDLAPARGySGPGRSL 
KSPSPSKSbLVLLCGRAMFQPATCPWQbPALSK 


5448

\ 


194 


1833 


KASKVTDAIVWY0KKIGAYDQQIWEKSVE0RE3KGLRNKPKKTA 
KVKPDLIDVDbVRGSAFAKAKPESPVJT'SLTTKGIVRVVFFPFFF 
R WLQVTSKV3 FFWLL VL YLLQ VAA 3 VL TCSTSSPUSI PLTE VI 
G PI WLMLbLGTVHCQI VSTRTPKPFLSTGGKRRRKLR KAAKbEV 
K REGDGS STTDNTOEGAVONHGTSTSKSVGTVFRDL : /WAAFFbS 
GSKKAKNSIDKS TETDNG YVS LDGK KTV KSGEDG 1 QNH EPQCET 








IRPEETAWNTGTLRNGPSiCDTQRTITNVSDEVSSc:EGPETGySL 
RRHVDRTSEGVLRNRKSHHYKKHYPNE3APKSGT£-CSSRCSSSK 
QDSESARPESETEDVI^WEDLIjHCAECHSSCTSETDVENHQINPC 
VKKEYKDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADM5 
VLB I SGM2 MNRVNSHI PGIGYOI FGNAVSL3 LGLTPFVFRLSQA 
TDLE0LTAHSASELYV1 AFGSNEDV1 VLSMV1 1 S ~ VVRVSLWJ1 
FFFLLC V AERTYXQVGI M * TSEGVLRNRKSHHYKXHY PNEDAPK 
SGTSC^SRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SET0VENHO1NPCVKKEYRDDPFHOSHLPWLHSSKPGLEKJSAI 
VWEGND CKKADMS VLE 1SGMI MNRVNSH I PG3 GYOI FGNAVS LI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSKEDVIVLSMV 
I ISFWRVSLVWI FFFLLCVAERTYKOVGIM 


5449 


194 


1833 


MASKVTDA2VWYOKK2GAYDQQIWEKSVEOREIKGLRNKPKKTA 
HVKPDLIDVDLVRGSAFAKAXPESPWTSLTTKGIVRWFFPFFF 
RWWLQVTSKVI FFWLLVLYLLQVAAI VLFCSTSS PHSIPLTEVI 
GPIWLKLLLGTVHCQIVSO'RTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDMTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDG3 QNHEPQCST 
I RPEETAWKTGTLRNGFS KDTQRT I TUVSDBVSS EEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC ' 
VKKEYF.DDPFKOSHLPWLKSSHPGLEKISAIVWEGNTICKKADMS 
VLE I SGM I MNK VNSHI PGI GYOI FGNAVSLI LGLT P FVFRLSOA 
TDLEQLT AH SASELYVI AFGSNECV1 VLSMVI I S F WRV SLVW1 
FFFLLCVAERTYKQVGI«*TSEGVLRNRKSHHYKKHYPNEDA?K 
SGTSCSSRCSSSRODSESARPESETEDVLWEDLLKCAECKSSCr 
SETDVEKH03NPCVKKEYRDDPFHCSHLPWLHSSHPGLEKISAI 
VWEGNDCKKACMS VLE I SGMIMNR VI* S H I PG1 G Y 0-1 FGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYV1AFGSKHDVIVLSMV 
1 3 SFWRVSLVWI FFFLLCVAERTYXQVGIK 


5450 


8i3e 


1242 


GQOFAS FFG * NHPE VT VAMALTD1 DLQLQFSMSQPE ALLLLAAG 
PADHLLLQLYSGHLQVRLVLGOEELRLQTPAETLLSDSIPHTW 
LT WEG WATLS VDG FLNAS S AVPGA PL E VP YGLF VGGTGTLG L P 
YLRGTSR PLRGCLHAATLNGRS LLR PLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTOSROAPLAFQAGG 
RRGDF} Y VDI FEGHLRAWEKGOGTVLLHNSVPVAJDGOPHEVS V 
HINAHRLElSVrOYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLG LTPE ATNASLLGCW EDLS VNGQR RGLR E ALLTRNMA 
AGCRLEEEEYEDDAYGKYEAFSTLAPEAWPAT-tELPEPCVPEPGL 
PP VFANFTQLLT IS PL WAEGG TA WLE WRH VQP TLD LMEAELR X 
SQVLFSVTRGAHYGELELDIljGAOARKMFTLLDVVJWKARFIHD 
GSEDTSDQLVLEVSVTARVPMPSCLRRGQTYLLP 2 OVNPVNDP? 
H 1 1 FPKG S LMV I L EHTQK PLGP E V FQAY D PDS ACEG LTFQ VLG T 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VSDGLQAS PPATLKWAIRPAIQI HRSTGLRLAQGS AMPILPAN 
LSVETNAVGODVSVLFRVTGALQFGELQKHSTGGVEGAEWWATQ 
AFHQRDVEOGRVRYLSTDFQHHAYDTVENbALEVQVGQEILSNL 
SFPVT30RATVWNLRLEPLHT0NTQ0ETLTTAHLEATLEEAGPS 
PPTFHYEWQAFRKGNliQLQGTRLSDGQGFTODDlCAGRV'rYGA 
TARAS EA VEDTFRFRVTAPPY FSPLYTFPIHI GGDPDAPVLTNV 
LLWPEGGEGVLSADHLFVKSLNSASYLYEVMERPKLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD 1 PFVATRQGE 
S SGDMAWEEVRG VFRVA I QP VNDHAPVQT I SRI FHVARGGRRLL 
TTDDVAFSDADSGFADAOLVLTRKDLLFGSIVAVDEPTRPIYRF 
TOEDIJ?KRR\^FVHSGADRGWIQLOVSX)GOHOATALLEVQASEP 
YLRVANGSSLWP0GGOGTIDTAVLHLDTNLDIRSGDEVHYHV7 
AGPRWGOLVRAGOPATAFSQODLLDGAVLYSHNGSLSPEDTMAF 
S VEAGP VHTDA1XQ VTX ALEGPLAPLKLVRHK KI YVFQGEAAE I 
RRDQLEAAQEA VPPAD I VFSVKS PPSAGYLVM VSRGALADEPPS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 
GVLVELE VLPAAI PLEAQNFSVP EGGSLTLAP PLLRVSGPYFPT 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
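
The one-letter legend above can be applied mechanically. The following is a minimal sketch, assuming the symbol table exactly as the legend states it; the names `LEGEND`, `decode_segment`, and `unrecognized_symbols` are hypothetical helpers introduced only for illustration and are not part of the specification.

```python
# Symbol table copied from the table legend; everything else here is an
# illustrative assumption, not part of the original listing.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Expand each symbol of a segment into its legend name, skipping whitespace."""
    return [LEGEND.get(ch, "unrecognized") for ch in segment if not ch.isspace()]


def unrecognized_symbols(segment):
    """Return the sorted set of non-whitespace characters absent from the legend."""
    return sorted({ch for ch in segment if not ch.isspace() and ch not in LEGEND})
```

A helper such as `unrecognized_symbols` may be useful for flagging characters in a transcribed segment (for example digits) that the legend does not define, without guessing what residue was intended.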








LLGLSLQVLEPPQHGPLQKSDGPQARTLSAFSWRMVEEQLIRYV 
HDGSETLTDS FVLMAN AS EMDRQSHPVAFT VTVLPVNDQP? 1 LT 
TNTGLQMWEGATAP1PAEALRSTDGDSGSEDLVYTIEQPSHGRV 
VLRGAPGTEVRS FTQAQLDGGLVLFSHRGTLDGG FPFRLSDGEK 
TS PGH FFRVT AQKQVLLSliKGSQTLTVCPGSVQP LSSQTLRASS 
SAGTDPOU.IjYRWRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GN 1 LYEHENiPPEPFWEAHDTLELQLSSPPARWAATLAVAVSFE 
AACPORPSHLWKNKGLWVPEGORARITVAALDASNLLASVPSPQ 
RS EHDVLFQVTQr PSRGOLLVSEE PLHAGOPHFLQSOLAAGObV 
YAHGGGGT00DGFHFRAHL0GPAGASVAGP0TSEAFA1TVRDVN 
EKPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVC 
RAPHNGFLSLVGGGbGPVrRFTOADVDSGRLAFVANGSSVAGIF 
QbSMSDGASPPLPMSLAVDILPSAIEVQLRAPLEVPQALGRSSL 
SQQQLRVVSDREEPEAAYRLIQGPOYGHLLVGGRPTSAFSQF0I 
DOGEWFAFTNFSSSHDHFRVlJVU^GVNASA\AmVTVRiU>LKV 

W RV P RARTE PGG SQl»V EQ FTQQDLEDGRLGLEVGRPEG RAPGP 
AGDSLTLELW AQG VP PAVAS LDF ATEP YNAARP Y S VALLSV P EA 
ARTEAGKPESSTPTGEPGPKASSPEPAVAJ<GGFLSrLEAWMrSV 
1 1 PMC LVLLLLAL1 LP LLF Y LRKRN KTG KKDVQVLTAK PR NG UA 
GDTETFRKV EPGQAI PLTAV PGQGPP PGGO PDP ELLQFCRTP NP 
ALKNGQYWV 


5451 


i 


2274 


RBSSEQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNS EPGS PHSLE ALRDAAP5 QGLN PLLL PTKMLF I FNFLFS P LP 
TPALI CI LTFGAAI FLWLITRPQPVLPLLDLKNQSVGI EGGARK 
GVSOKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNC 
PYRWLSYKOVSDRASYLGSCLLHKGYKSSPDQFVGIFAQNRPEW 
I ISELACYTYS MV AVPL YDTLG PE A 1 VH J.VNKADIAMVICDTPQ 
KALVL I GNVEKG FTPS LKV I I LMDP FDDDLKDRGE XSG I E I LS L 
YDAENLGKEHFRKPVPPSPEDLSVICFTSGTTGDPKGAMIG'HON 
IVSNAAAFLKCVEHAYEPTPDDVAISYLPLAHMFERIVQAWYS 
CGARVGFFOGD I RLLADDMKTLKPTLFPAVPRLLNR I YDKVQNE 
AKTPLKKFLLKLAVSSKFKEL0KG11RHDSFKDKLIFAKIQDSL 

v?oK VKVl V i u/virno Jovnl f . K>i/U*io \J V I CM Hj\J X lUULi 

FTLPGDWTSGHVGVPLACWYVKLEDVADMNYFTVNNEGEVCIKG 
•J 'NVFKG Y LKDPE.K TQEALDSDG WJLHTGD I GR WLPNG TLKI I DR K 
ICai FKLAOGEY I APEKI ENI YNRS0P\1-QI FVHGES LRSSLVGV 
WPDTDVLPSFAAKLGVKGSFEELC0NQWREAILEDL0K1GKE 
SGLKTFEQVKAIFLHPEPFS-IENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1833 


1138 


SR VPS LCLS LS LSLS PSREP VAGAPGCGTAG PPAMATLWGGLLR 
LGSLLSLSCLALS VLLLAQLS DAAKNFEDVR CKC I C?P Y K EN SG 
R I YN KN I SQKD CDCLHWE PMP VRGPD VE A Y CLRCE CKYEERS S 
VTIKVTI 1 1 YXS ILGLLLLYMVYLTLVEPILKRRLFGHAQLI OS 
DDDIGDHQPFANAHDVLARSRSRANVLNKVEYAQORMKLQVOEO 
RKSVFDRHWLS 


5453 


111 


1520 


PS I PAAVPOSAPPE PHREETVTATATS QVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTSKDRPVSOPSLVGSKEEPPPARSGSGGGSAKE 
POEERSOQQDDI EELETKAVGMSNDGR FLKFDI E I GRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERORFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELKTSGTLKTYLKRFKVMKIKVL 
RSWCROILKGLCFLHTRTPPIIHRDIiKCCNIFrTGPTGSVKIGD 
LGIiATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII 
EG CI RONJCDE R YS I KDLLNHAF FQE ETG VR VELAEEDDG EK I AI 
KLWLR 1 EDI KKXKGKYKDNEAI EFS FDLERNV PEDVAQEMVESG 
YVCEGDHKTMAKAIKDRVSLI KRKRSQRQL* 


5454


111 


1520 


PS I PAA VPOS AP PE PHREETVTATATS Q VAQQP PAAAAPGEQA V 
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
P0SERSQQODDI EELETKAVGMSKTJGRFLKFDI EIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERORFKEEAEMLKGLQHPNI 








VR FY DSKES TV KG KK C I VLVTELMTSGTLKTY I>KR FK VM X I KVL 
RSWCRQI LKGLQFLKTRTPP J I HRDLKCDNI FI TGPTGSVXIGD 
LG1ATLKRASFAKSV1GTPEFMAPEKYEEKYDSSVDVYAFGMCM 
LSMATSEYPYSEC0NAAQIYRRVTSGVKPASFDKVA1PEVKE13 
EGCIRQNXDER YS I KDLLNKAFFQEE7GVRVELAEBDDGEKIA J 
KLWLR I EDI KKLKG K Y KDNEA I E FSFDLERNVP EDVAQEMVESG 
YVCEGDHKTMAKA IKDRVSLIKR KRE QRQL * 




5455


1359


377 


LTMVSPATRKSLPKVKAMDF1 TSTAI LPLLFGCL.GVFGLFRLLQ 
WVRGKAYLRNAVWJTGATSGLGKECAKVFYAAGAKIiVIjCGRNG 
GALEELIRELTASHATKV0TKKPYLVTFDLTDSGAIVAAAAE1L 
-QCFGYVDl LVNNAG3 SYRGT1MDTTVDVDXRVMETWYPGPVALT 
XALLPSM J XRRQGHi VAISS1QGKMS J PFRSAYAASKHATQAFF 
DCLRASMEQYEI EVTV ISPGY I HTNLS VNAI TADGSRYGVMDTT 
TAOGRSPVEVAODVLiAAVGXKKKDVIIjADLLPSLAVYLRTLAPG 
LFFSLMASRARXERXS XN£ 


5456 


2 


233^ 


cgaglvaagavlvlypasragertrvpgspapsslplhspgacg 

TEVBMDP0RSPLLEVKGNIELKRPL1XAFSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGU5ATTKITTSHPRVPSLTTVP0TQG0TTA 
CXVSXKTGPRCSTAIATGLXNQKPVPAVPVQXSGTSGVPPMAGG 1 
KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDOENQOLQDQLR 
DAOOQVKALGTERTTLEGHU^KVQAOAEQGQOELKNLRACVL.EL
EERLSTQEGLVOELOKKQVELCEERRGLMSQLEEKERRLOTSEA
ALSSSQAEVASLROETVAOA^LTEREERLHGLEMERRRLHNQL
QELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL
SLSRSDERRGTLSGAPAPPTRKDFSFDRVFPPGSGODEVFESIA
MLVQSALDGYPVCI FAYGQTGSGXTFTMEGGPGGDPQLEGLI PR 
ALRHLFSVAQEJjSGOGWTYS FVASYVEI ynetvrdllatgtrkg : 
0GGECF.1 RRAGPGSEELTVTNARYVPVSCEKEVDALLHLARONR ' 
AVARTAQNERS SRSHSVFQLQI SGEHS SRGLQCGAPLSLVD1AG 
SERLDPGLALGPGERERLRETCA1NSSLSTLGLVIMALSNKESH ' 
VPYRNSKLTYLLONSLGGSAXMLMFVNISPLEENVSESLNSLRF 
ASK\ r EPSVLFGTAOSIsRKVIKTDPDL»CVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCFRAIX 


5457 


2 


2540


DDFVE R RR WTRTT CL VRS P PH VPVCGHACS WNGGSLDP LKGTPA 
LliRSAERLMRKVKXLRLDKEKTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFQVO KHSWDGLR S I IHGSRXYSGLI VNK 
APHDFQFVQXTDESGPHSHRLYYLGMPYGSRENSLIiYSElPXKV 
RXEALLLLS WXQMLDtf FQATPHKGVYS R EBELLRER KKLGVFG I 
TS Y D FHS E SGLFLFQASNSliFH CRDGG XNGFMVS PG PGCV S PMK 
PLEIKTQCSGPRMD?KICPADPAFFSFINNSDI»VTVANIETGEER 
RbTFCHOGLSNVLDDPKSAGVATFVlOEEFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKNPXIALKLAEFQTDSOGXIVSTQEKELVQPFSSLFPKVEY1 
ARAGWTRDGKYAWAMFLDRPOOWLOLVLLPPALFIPSTENEEQA 
ASLCOSCPOECPAVCGVRGGHORLDOCS 


5458 


6642 


4022 


fvpglrepqwepaqpsatmsapseeeeyarLvmeaqpewlraev 
krlskeiaettrekioaaeyglavleekhqlkiiqfeelevdyea 
irsemeqlxeafgqahtnkxkvaadges reesli qesasxeqyy 
vrxvlelotelxqlrnvltntosenerlasvaqelkeinqnvei 
qrgrlrddikeykfrearllodyseleeenislqkqvsvlrqnq 
vefeglkhei krlee et ey lt3 s ql.eda i rlkei s e rqleealet 
lxtereoxnslrkelshymsindsfytshlhvsldgiikfsddaa 
epnndaealvngfehggiaklpldnktstpkkeglappspslvs 
dllselniseicklkoolmqmerekagllatlqdtqkqlehtrg 

SLoEOQt' KVTRLTENLb ALR RLUASXfcKUlAbUWliRlJKUonhJLAj 
DYYEVDINGPEILACXYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRODRELLARLEKELKKVS 
DVAGETQGSLSVAODELVTFSEELANLYHHVCMCNNETPNRVML 
DYYREGQGGAGRTSPGGRTSPEARGRRSPrLLPKGIiLAPEAGRA 
DGGTGDSSPSPGSSLPSPLSDPRREPKNIYNLIAURDQIKHLQ 
AAV DRTT ELS RQR I AS OH LG P A VDXDKEALME E I LKL KS LLSTK. 






REQ I TTLRTV • >KAN KQT AEVAL AN LK S KY ENEKAMVTE TMMKLR

KKTLNSLLHKA1C0KLALTQRLELLELDHEQTRRGRAKAAPKTK
PATPSVSHTCACASDRASGTGLANOVFCSEKHSIYCD


5459


316 


1262 


RGGKRLSGMASNFKD I VKOGYVR I RSRRLG I YORCWLV FKKAS S 
KG P KRLE KFSDERAAY FRCYKK VTE LNNVKNVARLP KSTKKHA1 
GIYFNDDTSKTFACESDLEADEWCKVXiQMECVGTRlNDISLGEP 
DL1,ATGVERE0SERFNVYUV,PSPNLGCYMGECALQ1TYEYICLW 
DVOWPRVKLISMPLSALRRYGRDrTWFTFEAGRMCETGEGLFIF 
OTRDGEAIYQKVHSAALAIAEQHERLLQSVKNSMLOKKMSERAA 
SLS TMVPL PRS A YWQHI TRQMS 7GQLYRLQDVS S P1»KLHRTETF 
PAYRSEH 


5460


45 


2057 


RPGCRAGELSTGSRARERVRKRVSAPCGQDSRRCDFEVLRGRSP 
GLGLAEMPSCG ACTCGAAAVRLITS S LAS AQRG I SGGR I HMSVL 
GRLGTFETQ1LORAPLRSFTETPAYFASKDGISKDGSGDGNKKS 
AS EG S SKKSG SGNSGKGGNQLRC P KCGDLCTHVETF VSSTRPVK 
CEKCHHFFVVLS EADS KKSI 1 KE PE SAAEAVKuAFQOKPPPPF K 
KI YNYLDKYWGOSFAKKVLSVAVYNHYKRI YNNIFANLRQQAE 
VEKOTSLTPRELEIRRREDEYRFTKLbOIAGISPHGNALGASMQ 
C0W00IPQEKRGGEV r LDSSHDD3 KLEKSNI LLLGPTGSGKTLL 
AQTLAKCLDVPFA3 CDCTTbTQAGY VGED3 ESV 1 AXLl^QDANYN 
VE XAQQGI VFLDEVDK 3 GSVPG3 K QL»3DVG GEGVQQGLLKLLEG 

ti vi^peknsr)q j rgetvovdttnii:fvasgafngldri I SRRK 

NE KYLGFGTP SK LGXGR RAAAAAPLANR SG ESNTHQtQ EEXDRL 
LRHVEARDL1EFGM1PEFVGRLP\^PLHSLDEKTLVQ1LTEPR 
NAVIPQYQALFSMDKCELI^VTCDALKAlARLALERKTGARGIiRS 
I MB KLLL EPMFE VPNSD I VCVE VL> JCE WEG KXEPGY I RAP TKES 
SEEEYDSGVEEEGWPR0APAAN5 


5461 


1481 


160 


INPPPPPKSPCGKARKWRRRRRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFLLLVLLLYAPVGFCLLVLRLFLGIHVFLVSCALPD 
SVLRRFWRTMCAVLGLVARQEDSGLRDHSVRVLISNHVTPFDH 
NI VNLLTTCS TPLLNS P PS ? VCWS RGF MEMNGRGELVES LKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSKPFSIQDWOPLTL 
OVQRPLVS VTVS DASWVS ELLWSLFVP FTVYOVRWLRFVHROLG 
EANEE FALRVOOLVAKE LGOTGTRLTPADKAEHMKRORH PRLRP 
OSRQSS FPPSPQ PS PDVOLAHjAOK VKEVLPH VPLG VI QRDLAK 
TGCVDLTITNLLEGAVAFMPED1TKGTQSLPTASASKFPSSGPV 
TPQPTA LTFAKS S WARC ES LQE R KOAL Y EYARRRFTERRAQ EAD 


5462 


" 663" 


- q -i 


RLSNGSFSAPSLTNSRGSVHTVSFLLQIGLTRESVTIEAQELSL 
SAVKDLVCSrVYOKFPECGFFGMYDKHLLFRHDMNSENILOblT 
S ADE 3 HEGDLVE WLS ALAT VEDFO 3 R PHTLYVHS Y KAPTF CDY 
CGEKLWGLVROGIjKCEGCG1.NYHKRCAFKIPNNCSGVRKRRLSN 
VSLPGPGIiSVPRPLOPEYVALPSEESHVHQEPSKRIPSWSGRPI 

wmekmvmcrvkvphtfavhsytrpticqyckrllkglfrqgmqc 
kdckfnchkrcaskvprdclgevtfngepsslgtdtdipkdidn 
kdi nsds srglddteeps ppepkmfflidpsdl/dverdeeavkti 
spstskn3 plmk wqs 1 xk7xrkss w/kegwnivhytsrdni,rk 
rhywrldskcltlfqnesgskyyke3plseilrissprdftnis 
ogsnphcfeii tdtmvy f vgknngds shnpvlaatgvgldvaqs 
wskai rqalmpvtp0asvc7spg0gkdhkdlsts isvsncqiqe 
wdlstwqifadevlgsgqfgivyggkhrktgiidvaikvidm 
rfptk0esqlrmevail.qnlhhpg3vklecmfetpervfwmek 
lhgdmlemilsseksrbperltkfmvtqilvalrnlhfknivhc 
dlkpenvli*asaepfpqvklcdfgfari1geksfrrswgtpay 
lapevlrs kgynrsldmwsvgvi i yvslsgtfpfnededindqi 
okaafmyp pnpkre i sgea1 dli ronllqvkmrkrysvdxslshp 
w lqdyqtw ldlre petr i g er y i thes ddarw eihaythnlv yp 
khfimapnpddmeedf 


5463


23*7 


101S 


LLSVTi^TTSRCSHLPEVLPDCTSSAAPVVKTVEDCGSLVNGQPQ 
YVMOVSAKDGOLLSTWRTIATOSPFNDRPMCRICKEGSSQEDL 








LS?CECTGTbGflHRSCLEHWLSSSNTSYCEU:HFRFAVERKPR 
PLVEWLRN PGPOH E K RTLFGDMVCFLF I TPLAT I SGW LCLRGA V 
DHLHFSSRLEAVGL3 ALTVALFTI YXFWTliVSFRVHCRLYNEVJR 
RTNQRV1LLIPKSVIJVPSN0PSLLGLHSVKRNSKETW 


5464 


ISi 


677 


S PSMNPR K KVDbKL 1 1 VG A IG VGKTSMjHQYVH KTF YE E YQTTIj 
GASILSK1 1 ILGDTTLXLOIMDTGGQBRVRSMVSTFY KGSDGC2 
LA FDVTDLES FEALD I WRGDVLAKI VPMEQSY PMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465


5271- 


334 fc 


KGDPREFIRVHREALECDYVSAHLHEW1DLIFGYK00GPAAVEA 
VNVFHHLFYEGQVDJ YNI NDPLKETATIGFINNFGQI PKQLFKK 
PHP PKRVRSRLNGDN AG I SVLPGSTSDKIFFHHLDNLR PS LT P V 
KELXEPVGQI VCTDKGI LAVEQNKVL1 PPTWNKTFAWGYADLS C 
RLGTYES DKAM TVY ECLS EWG0I LCA1 CPNPXLV I TGGTS TWC 
WJEMGTS KE KAKTVTLKOALLGHTDTVTCATAS LAYHI I VSG£ R 
DRTCIIWDLNKbSFLTOLRGHRAPVSALCINELTGDIVSCAGTY 
IHWJSINGNP1VSVNTFTGRSQQIICCCMSEMNEMDTQNVIVTG 
HSDGWRFWRMEFL0VPSTFAPEPAEVLEM0EDCPEAQIGOEAQ ' 
DEDSSDSEADEQSI SQDPKDTPSQPSSTSHRPRAASCRATAAKC j 
TDSGSDDSRRWSDQLSLDEXDGFIFVNYSEGQTRAHLQGPLSKP 
HPNP1EVRNYSRLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 
EVTALG I S KDHS R 2 LVGDS RGRVFS WS VSDQPG RSAADH WVKDE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFOSEIKRLKI 
SSPVRVCQNCYYNLQKERGSEDGPRNC 


5466


3 


S92 


HACAKASAHASGRLVRWWRKRRSVriGlQTSPVLUASLGVGLVTi 
LG LAVGS YLVRR S R R PQVTLLDPNEKYL LRLLDKTTVS HNTKRF 
RFAbPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LTYTGKGHFNI0PNKKSPPEPRVAXKLGMIAGGTG1TPMLQLIR 
AILKVPEDPTOCFbLFANQTEKDIILREDLEELOARYPNRFXLW 
FTLDH? PKDWAYS KG FVTADMI REHL?APGDDVLVLLCGPPPMV 
QLACHPNLDKLGYSOKMRFTY , 


5467


21C1- 




GEALRVGTRGCRRDLPDPOARIFIQKKDLEEDESVTAAHLKSRG 
RSPRKIDQFCNSSNMVUGSVTFRDVAIDFSQEEWECLOPDQRTL 
YRm^MLENYSHblSLAGSSISKPDVITLLEQEKEPWMWRKETS 
RRY PDLELKYG PEKVS PENDTSEVNLPKQVI KO I STTLG3 EAFY 
FRNDSEYRQFEGLOGYOEGNINQKMI S YEKLP'i'HTPHASL ■ CNT 
HKPYECKJECGK Y FSCGSNL I QHQSJHTGEKPYKCKECGKAFO^H 
IQLTRKQKFKTGEKTFE CKECGKAFNLPTQLNRHKN I HTVKKLF 
ECXECGKSFNRSSNLTOHOSIHAGVKPYQCKECGKAFNRGSNLI 
OHOKIHSNEKPFVCKECGKiAFRYHYQLIEHCQIHTGEKPFECKS 
CGKAFTLLTKLVRHOKIHTGEKPFECRECGKAFSLLNQLNRHKN 
IHTGEKPFECKECGKSFMRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHLI 0K0 K I HSNE KP F VCRECEMAFRYH COb I EHSR 3 HTG 
DKPFECODCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLSQHOKTHTGEKPFECKECGKFFRRGSNLNQHRSIHTGKKP? 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKArRLHMOLI 
RH0KLHTGEKPFECKECGKVFSLPT0LNRHKNIH7GEKAS 


5468 


225 


^.976 


S FLTDLFQSLAOLENLCKQLYETTDTTTRIiQAEKALVEFTNS PD 
CLSKCOLL LERGS S S Y SOLliAATCLTKljVSRTNN PLP LEQRI D I 
RNYVLNYLATRPKLATFVTOALIQLYAR ITKLGWFDCOKDDYVF 
RNATTDVTRFLODSVEYCIIGVTILSOLrNEINQVSATAFLIEA 
DTTHPLTKHRKIASSFRDSSLFDIETLSCNLLKQASGKNLNLNT) 
ESQHGLLMQLLKLTHNCLNFDFIGTSTDES SDDLCTVQI PTS WR 
SA FLDS STLQLS T I GR CE Y EKTCALbVQ LFDQSAQS YQELLQS A 
SASPMDIAVOEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGEL 
VCRVLOLMNLTDSRIAOAGNEKLELAMLSFFEOFRKIYIGDOVQ 
KS S KL YR R LS EVLG LNDETM VLS VFIGK 1 1 TNLK Y WGR CEP 1 TS 
KTLQLLNDLS 1 G YSS VRKX.VKLSAVQFMLNNHTSEHFS FLGINN 
QSHLTDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 
AVAOMFSTNSFNEOEAKRTLVGLVRDLRGIAFAFKAKTSFMKLF 
EWI YPS YMP ILQRAI ELWYKDPACTTPVLKLMAELVHNRSQRLQ 








3 S I CFSKLKAALSGSY VNFGVFRLYGDDALDNALQTFI K^LLSI 
PKSDLLDyPKLSOSYYSIiLEVLTQDHMNFlASLEPHVlflYILSS 
1 5 EGLT ALDTM VCTGCCS CLDH IVTYLF KQLSR STKKR TTP LNQ 
ESDRFLHIMQ0HPEM200MLSTVLNI1IFEDCKNQWSMSRPLLG 
L I LLNE KY FS DLRN5 1 VN S QP PE KQ QAMK LCFEN LMEG I E R NLL 
TKN RDR FTQNLS AFRRE VNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 


DCEFETSLVPWHLPMGVfLCSGLLFPVSCLVLLOVASSGNMXVLO 
EPTCV3 DYMS I STC EWKMNG PTNCS TELRLL YQLVFLLS EAHTC 
VPENNGGAGCVCKLLMDDWSADNYTLDLWAGQQLLWKGS F KPS 
EHVKPRAPGNLTVJ^TNVSDTLLLTWSNPYPPDNYLYNHLTYAVN 
JWSENDPADFRIYhrVTYLEPSLRIAASTLKSGISYRARVRAWAO 
C YNTTWS EWS PSTKWKNSYREPFEQHLLLGVSVS CI V J LA VCLL 
CYVS J TKIKKEWKDQI PNPARSRLVAII IQDAQGSQWEKRSRGQ 
EPAKCPHWKNCLTKLLPCFLEHNMXRBEPPHKAAKEWPFOGSGK 
SAWCPVEISK7VLWPES1£WRCVEL?EAPVECE3EEEVEEEKG 
SFCASPESSRDDFOEGREGIVARLTESLFLDLLGEENGGFCQQD 
MGESCbLPPSGSTSAKMPKDEFPSAGPKEAPPWGKEOPLHLEPS 
PPASPTQSPDNLTCTETPLVlAGNPAYRSFSNSbSQSPCPRELG 
PDP LLARKLEEVE PEM P CV POLS EPTTVFQPE PET WEQ 3 hR R.W 
LQHGAAAAP VS A PTSGYQE FVHA VEQGGTQA S A WGLGP PG EAG 
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQDUPGCPGDPA 
PVPVPLFTFGLDREPPRSP03SHLPSSSPEHLGLEPGEKVEDMP 
KFPLPOEO^TDPLVDSLCSGIVYSALTCHLCGHIjKQCHGOEDGG 
QTPWUVSPCCGCCCGDRASPPTTPLRAPDFSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFKPAPGNAQSSSQTFK1VNFVSVGPTYM 
RV£ 


5470 


17 


1416 


TA CR 3 K TSUVRG 1 VK J-.'DA VEM LASYGLA YSLMKFFTGPMS DF 
KWGIA/FVNS KRDRTKAVLCMWAGAI AAVFKTL I AYSDLG Y YI 
INKLHHVDESVGSKTRRAFLYLAAFPFMDA^WTHAGILLlolKY 
SFLVGCASISDVIAQWTVAILLHSHLECREPLLIPILSLYMGA 
LVRCTTLCLGYYKNIHDI I PDRSGPELGGDATlRKMLSFWWPIjA 
h I LATQR I SR P I VN LFV£RDLGG S S AATEAVAI LT AT YPVG HMP 
Y GW1/TE I RAVY PAFDKNNPSNKXjVSTSNTVTAAH I KKPTFVCMA 
LSLTLCFVMFWTPNVSEX 3 LIDI IGVDFAFAELCWPLR1 FSFF 
FVPVTVRAHL^^WLMrLKKTFVLAPSSVLRIIVLIASLWLPYL 
GVHGATLGVGSLl^GFVGEST^AIAACYVYRKQKKKMENESAT 
EGEDSAMTDMPPTEEVTD3 VEMREENE ' 


5471 


■ -Tb4b ' ' 


656 


RSSAPPG POR^AAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPGVPGEVEMVKGOPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAI KKISPFEHOTYC0RTLREI01LLRFRHENVIGIRD3 LR 
ASTLEAMRDVYJ VQDLMETDLYKLLKSUUlj5NDHi Li hi* j^J LH 
GLKY J HSANVLHRDl.KPSNLLINTTCDLKI CDFGIARIAEPEHD 
HTG FLTE YVATRWY RAPE 1 MLNS KG YTKS 1 DI WS VGC ILAEMLS 
NR PIFPG KH YLDQLNH3 LG 3 LGS PSQEDLNCI INMKARNYLQSL 
PS KTKVAWAXL.FPKSDSK?.LDLLDRMLTFNPNKRI ?VEEALAHP 
YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFO 


5472 


1469 


753 


lyvmarylsdeevavs3drlckangrspsipfgtvr1pgrarvr 
dpoalw i fg ygs lwr pdfaysdsrvg fvrg y srrfwqgdt fhr 
gsdkmpgrwtlledhegctvigvayqvqgeqvskalkylnvrea 
vlggydtkevtfypqdapdoplkalayvatpqnpgylgpafeea 
iatqilacrgfsghnleyllrvrdv^lcgpoaodehlaaivda 
vgtmlpcfcpteqalalv 


5473


j 


2119


KEMFATMSKXKE0L7JCVKEC YS PhhYESQQhh I PLBELEKCWTS 
FYDSLGK1NEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSOSVQKFVTLSNVLKHFDQTRIXJROIADIHVAFQSMVKK 
TGDWKKHVETNSRLMKKFEESRAELEKVLRIAQEGIiEEKGDPEE 
LLRRHTEFFS0LD0RVLNAFLKACDELTD2LPEQEQ0GLOEAVR 
KLHKQWKDI.GGEAP YHLLHLK1 DVEKNRF1ASAEECRTELDR ET 








KLMPQEGSEKI IKEKRVFFSDKGPHHLCEKRLQLIEELCVKL.PV 
RDPVKDTPG7CHVTLKELR.^AIDSTy^KLMEDPDKWKDyTSRFS 
EFSSWISTNETQLKGIKGEAIDTANHGEVKRAVEEIRNGVTKRG 

EVEKMLSNFGDCVQYKEIVKNSLEELISGSXEVQEQABKILDTE 
NLFEAOOL'LLHHQQKTKR 1 3AKKRDVQQC 1 AQAQQGEGGL? DRG 
HE El iR K LSSTLDGLER SRERQERR I QVTLR KW ER FETN KETWR 
Y L FQTG £ S KER FLS FS S LE SLSSELEQTKE FS KR TES 1 AVQ AEN 
LVKEAS EI PLGPQNKQLLQQQAKS I KEQVKKLEDTLEEE YVI DK 
S 


5474


2 


780


TPDVRQL9ASRRGIAVASWCSPRWFAGEEKAFVXS3WLLRQSTI 
LKRWKKNKFDLWSDGHL1 YYDDCTRQNI EDKVHMPMDCI NI RTG 
OECRDTGPPDGXSKDCMLQIVCRDGKTISLCAESTDDCLAWKFT 
LODS RTNTA Y V G S AVMTDETS WS S P P P V TAYAAPAPEVG R TLS 
LQQAYGYGPYGGAYPPGTOWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI I RERYRDNDSDLAbGMLAGAATG^LGSLFWVF 


5475




SOfc 


ARGWLESLSLTCOTrPPPSSPCLLHSPETFlHTMPPNLTGYYRF 
VSQKKMEDYL0AI»N1SIAVRK1ALLLKPDKE I EHQGNHMTVRTL 
STFRT3YTVOFDVGVEFEEDLRSVDGRKCCTIVTWEEEHLVCVQK 
GEVPNRGKRHWLEGEMLYLELTARDAVCECVFRKVR 


5476 


1S2 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETS1H0YLVDEPTLSWSR 
PSTRASEV1.CSTNVSHYEL0VEIGRGFBNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALOKAVILSHFFRHPNITTYWTVFTVG 
S WLWV ISP FPA YGSASQLLRT YFPEGMS ETL I RNI LFGAVRGLN 
YLHONGCZ HRS1 KASHILISGDGLVTLSGI.SHbHSLVKHGORHR 
AVYDFP0FSTSV0PWLSPELLRQDLHGYNVKSD2YSVG1TACEL 
ASGOVPFGDMHRTQMLLOKLKGPPYSPLDISIFP0SESRMKNSQ 
SGVUSGJGESVLVSSGTHTWSDRLHTPSSKTFSPAFFSLVQLC 
LQ0DPEKRPSASSLLSHVFFKQMKEESQDS3LSLLPPAYNXPSI 
SLPFVLPWTEFECDFPDEKDSYWEF 


5477 




X044 


RGNSRLRYSHEDELQLPRLPELFETGRQLbDEVEVATEPAGSRI 
VQEKVFXGLDLLEKAAEMLSQLDLFSRNEDLEEJASTDLKYLLV 
PAFCGALTMKOVNPSKRLDHLORAREHFINyLTOCHCYHVAEFH 
LPKI^NKSAENKTANSSMAYPSLVAMASQROAKIQRYKOKKELE 
HR LS AM K S AVESGQADDER VRE YYLLHLQR W I DISLEEIESIDQ 
EIKILRERDSSREASTSNSSRQERPPVXPFILTRNMAQAXVFGA 
G Y PS LP TM TVSDW Y EQHRK YG ALPDQGI AKAAPEE FRKAAQQQE 
EOEEKEEEDDEOTLHRAREWDuWKJXCKPRGYGNRONMG 


5478 


2 


635 


KTVRI WPJJVKGESTVFRAKTATVRSVHFCSDGOS FVTASDDXT 
VX VWATHR OKFLFSLS QHINVJ VR CAKFS PDGR LI VSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVuFHPSGTCIAAAGMDNTVKVW 
DVRTHRLLOHYCLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
MEGRLLYTLHGHQGPATrVAFSRTGEYFASGGSDEOVMVWKSNF 
D I GDHGEVTKVPRPPATLASSMGNLTVS I LEQRLTLEEDKLKQC 
LENQQL2MQRATP 


5479 


2 


835 


KTVR 1 W V F w V KG£ 5 T Vr RAHTATVRS Vh FC5 DGQb F VTAS DDXT 
VKVWATHR0KFLFSLSQHINWVRCAKFSPDGR1.IVSASDDKTVK 
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKW; 
DVRTHRLLOHYQLKSAAVNGLSFHPSGNYLITASSDSTLKILDI* 
MEGRLLYTLHGHCGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
D1GDHG5VTKVPRPPATLASSMGNLTVS I LEQRLTLEEDXLKQC 
LENQQLI MCRATP 


5480


444


1952 


LSIjTSRMEEAELVKGRIjQAITDKRKIQEEISOKRLXIEEDKJLKH 
qhlxxkalkexwllix3issgkeqeemkkqnqqdqhq1qvleqsi 
LRLEKEIQDLEKAEI^ISTKEEAILKKLKS I ERTTEDI IRS VKV 
EREERAEESIEDIYANIPDLPKSYIPSRLRK31NEEKEBD30NR 
KALYAMEI KVEKDLKTGESTVLSSIPLPSDDFKGTG1 KVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAFVEVEELLROASERNSKSPTEYH 
EPVYANPFYRPTTP0RETVTPGPNF0ER1 Kl XTNGLG1 GVNESI 
HNMGNGLS E E R GNN FNHI S PI P P VPH PRSV 1 QQAEE KLHTPQKR 








LMTPWEESNVMQDKDAPSPKPRLSPRETIFGKSEHQNSSPTCOE 
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKFSYHPIAPKSQVYQPAKP 
TPbPRKRSEASPHEKHXS 


5481




1422 


NSPGSVCbCQCVCPSbLHCLPPbLbbLbbPbbbHESPQPPALRV 
VATSSDRNFMNKKQKPVbTGQRFKTRKRDEKEKFEPTVFRDTLV 
QGLNEAGDDbEAVAKFbDSTGSRbDYRRYADTLFDIbVAGSMbA 
PGGTR1 DDGDKTKMTNHCVFSANEDHETI RNYAQVFNKLIRR YK 
YLEKAFEDEMKKLXjLFLKAFSETEQTKLAMLSGHjLGNGTLPAT 

i ltslftdsiivkeg i aasfavklfkawmaekbansvtsslrkan 
ldx^llelfpvnrqsvdhfakyftdaglkelsdflrvooslgtr 
keloketoerlsoecpikewlyvkeekikrndlpetavkjllwt 
cimnavevjnkkeelvaeqalkhlkoyafllavfssqgqselill 
okvoeycydni hfmkafqk i wlfy kadvlse ea i bkwykeahv 
akgksvfldqmkkfvewlqnaeeesesegeer 


5482 


1492 


528 


tkwmtgmcyapho^syingvttskpgvslvyskpsrnlslrl 
eglqekdsgpyscsvnvqdkogksrgke i ktlelnvlvppapps 
crlogvphvgwwtlscqsprskpavoyqwprolpsfqtffapa 
ldvr^gslslrnlsssmagvyvckahnevgtaqcnvtlevstgp 

GAAWAGAWGTbVGUSbbAGbVbbYHRRGKALEEPANDIKEDA 
1 APRTbPW PKSSDT1S KNGTLS S VTS ARAbRP PHG P PRPGALTP 
TPSLS SQAL PS PRLPTTDG AH POP I S PI PGGVSS SGbSRMGAVP 
VMVPAOSQAGSLV 


5483


a 


786 


FFFFKGCRAGRGNESDYRKbEEMHQRFbVSERSKDDbObRbTRA 
ENRIKObETDSSEEISRYOEMIQKbQNVLESERENCGbVSEQRb 
KLQ0ENK0LRK3TESLRKI ALEAOKKAKVKI STMEHEFS 1 KERG 
FEVObREMEDSNRNSJVEbRHbbATQOKAANRWKEETKKLTESA 
RlRTNNbKSELSRQKLHTQEbLSQbEMANEKVAENEKLIbEHQE 
KANRLORRLSQAEERAASASQQLSVITVORRKAASbMNbENl 


5484 




1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDOENAASGSNASGS 
ESDODERGDSGQPSNKEbFGDDSEDEGASHHSGSDNHSERSDKR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDXSDOSDDEXIQNSDDEERAQGSDEDK 
bONSDDDEKMONTDDEERPObSDDERQObSEEEKANSDDERPVA 
SDNDDEKQNSDDEEOPQLSDEEKMONSDDERPQASDEEHRHSDD 
EEEQDHKSESARGSDSEDEVLRKKRKNAIASDSEADSDTEVPKD 
NSGTMDbFGGADD I S SG S DGEDK PPT PGQPVDENGbPQDQQEEE 
PIP ETR I EVE I PK VNTDLGNDbY F VKbPNFbS VE PRPFDPQ Y YE 
DEFEDEEMbDEEGRTRbKbKVENTIR WR I RRDEEGNEIKESNAR 
IVKWSDGSMSbHU3NEVFDVYKAPbQGDHNKbFlROGTGbC3GQA 
VFKTKbTFRPHSTDSATHRKMTbSLAPRCSKTOKIRlbPMAGRD 
PECQRTEMI KKEE ERbRAS I R R ESQQRR MREKQHQ RGbSAS YbE 
PDR YDE EEEGEES I SbAA I KNR YKGG I RE ERAR I Y SSDSD EG SE 
EDKAQRbbKAKKLTSDEVRPNbFNSRGbSCTOEPTAbNEEbTDQ 
AGTK 






5485


1074


KK K I LS 5 MMD£> E An EKR P P 1 L TS S KQP 1 5 PH I TNVG EM KHYbCu 
CCAAFNNVAI TFP IQKVLFROObYGI KTRDAILQLRRDG FRNLY 
RGIbPPbMOKTTTlAbMFGbYEDbSCbbHKHVSAPEFATSGVAA 
VIAGTTFAIFTPbERVQTLLQDHiCHHDKFTNTyOAFKAbKCHGI 
GEY YRGbVPI bFRNGbSNVL F FG bRG P I KEHLPTATTHS AHLVN 
DFICGGLbGAMLGFLFFPINWKTRIQSQIGGEPQSFPKVFQKI 
WLERDRKblNbFRGAHbNYHRSLI SWGI I NATYEFbbKVI 


5486 


1404 


142 


I PGSTI S WSPAAARGbS VCR C CRbH PAS AMDbFGDLPEPERS PR ~ 

GSbATSISQMVKTEGKGAKRKTSEEEKNGSEEbVEKKVCKASSV 
IFGLKGYVAERKGEREEMQDAHVIbNDITEECRPPSSLITRVSY 
FAVFDGHGGI RASKFAAQNLHQNblRKFP KGDVI S VEKTVKRCb 
bDTFXHTDEEFLKOASSQKPAWKDGSTATCVbAVDNIbYIANbG 
DSRAI bCR YNEBSQ KHAAbSLS KEHNPTO YEERMR I QKAGGNVR 
DGRVLGVbEVSRSIGDGQYKRCGVTSVPDIRRCObTPNDRFIbb 








ACDGLFKVFTPEEAVWFILSCLEDEKIOTREGKSAADARYEAAC 
NRLANKAVQRGS ADNVTVMWRI GK 


5487 


535 


162 


AVSL2QIRGLQTPAPVPLPLQPCPSNCDMERVTLALLLLAGLTA 
LEANDPFANKDDPFyYDWKNLOLSGLICGGLLAIAGIAA VLSGK 
CKCKSSQKQHS P VPEKAI PLITPGS ATTC 


5488


1072 


255 


7virtnt5UurvI'U"vC<t v r\Jri V V V VuoU'l 1 ULivolJi olvurft loLl in 

GHKFFIGEGGKGANOCVOA^LGAMTSMVCKVGKDSFGNDY I EN 
LK.QNDI STE FTY QTKDAATGTAS 1 1 VNNEGONI IVIVAGANLLL 
NTEDLRAAAN VI S RAKVMVCQLE I TPATSLEALTMAJRRSGVKTL 
FNPAPAIADLDPOFYTLSDVFCCNESEAEJLTGLTVGSAADAGE 
AALV1»LKRG CQ W 1 1 TI.G A5GC WJjSQTEP E P KH I PTE KVKA VD 
TTVSFKI 


5489 


Q - 
OA 




LERDCRSPVEPWAAASPDLALACLCHCODLSSGAFPNRGVLGGV 
LFP7VEMVI KVF VATSSGS I AIRKKQQE WGFLEAMK1 DFKELD 
IAGDEDNRRWMRENVPGEKKP0NG3 PLPPQa FNEEQY CGDFDSF 
FSAKEEMlIYSFLGLAPPPDSKGSEKASEGGETEAQKEGSEpVG 
NLPEAOEKNEEEGETATEETEEIAT^IEGAEGEAEEEEETAEGESP 
GEDEDS 


5490


61 


89? 


T/i^ 1 7\ >s t> t r* m y pt T*nnvt ci f^iMitnorni/ufii t t t /''DTP t\j (/ 

GKGPVAAr IDQSNI FLTDrKI FLCj0 w Rtc.PKWPLljiwLOli JhPljK 
LERDCRSPVEPVJAAASPDLA1ACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGSIAIRKKQQEWGFLEANKIDFKELD 
3 AGDEDNRRWMRENVPGEKKPQNG1 PLPPQI FNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKAEHGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEETEEIAKEGAEGEAEEEEETAEGEEP 
GEDEDS 


5491 


204 


2194 


GSAPRLSLGPTGAOARDPDWWARPPSKPYT0SKEDRPDTEGRSE 
QGDMASSFLPAGAI IXsDSGGELSSGDDSGEVErPHSPEIEEi SC 
LAELFEKAAAKL0GLI0VASREQLLYLYARYKQVKVG^3aCTPK? 
SFFDFEGKQKWEAWKALGDSSPSOAMQEYIAWKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEET1REEDKNI FDYCRENN IDH 
ITKAlKSKNVDV^KDEEGRAVi.HVJACDRGHKELVTVbl^HRAI) 
INCODNEGQTALHYASACEFLDIVELLLQSGADPTLRDQDGCbP 
EEVTGCKTVSLVLORHTTGKA 


5492 


3 


1696 


AS KN PL.S AVCTTG 3 MSSLAVRDPAMDKSLRS VFVGNI PYEATEE 
QLKD I FSEVGSWS FRL VYDRE TG K PKG YGF CEYQDQETALS AM 
RNLNGREFSGRALRVDNAASEKNKEELKSLGPAAP 1 IDSPYGDP 
IDPEDAPES I TRAVASLPPEQMFELMKQMKLCVQNSHQEARNML 
LQNPQLAYALLQAQVVMRIMDPEXALKILHRKIHVTPLIPGKSO 
SVSVSGPGPGPGPGLCPGPNVLLNOQNPPAPQPQELARRPVKDI 
PPLMQTPIOGGIPAPGPIPAAVPGAGPGSLTPGGAMQPQLGMPG 
VGPVPLERGOVOKSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 

GPLGDPRLLIGEPRGPM2D0RGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGICGPGPIN1GAGGPPQGPR0VPGISGVGNPGAGM0GTG10 
GTGMQGAGIOGGGKOGAG I QGVS I OGGG I QGGG IQGAS KQGGSQ 
PSS FSPGQSQVTPODOEKAAL IMQVLQLTADQIAMI.PPEQROS 1 
LILKEQIQKSTGAE 


5493 


1 




RAPMMTKAVPEEPRKPGRLTOALNSPLTWEHVWICVPGGTPDCL 
TDTFRVKRPHZjRRSASNGHVPGTPVYREKEDMYDEIIELKKSI-H 
VQKSDVDLMRTKLRRLEEENSRKDROIEQLLDPSRGTDFVRTLA 
EKRPDASWVIKGLKOR I LXLEQOCKEKDGTI S KLQTDMKTTNLE 
EMRIA>ffiTYYEEVHRIjQTLLASSETTGKKPLGEKKTGAKRQKKK 
GSALLSLSRSVQEbTEENQSliKEDLDRVLSTSPTISKTQGYVEK 
SKPRLLRRIVELEKKLSVMESSKSHAAEPVRSHPPACIASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALOE0LLQRDI»EVKOLL 
QAKADLEKEXE CAREGEEERRERE EVLRE B 1 QTLTSKLQELQEK 
KKEEKSDCPEVPHKAOEL.PAPTPSSRHCEODWPPDSSEEGLPRP 
RSPCSDGRRI)AAARVLOAOWKVYKHKKKKAVLDEAAVVL<?AAFR 








GHLTRTKLI^ASKAHGSEPPSVPGLPDOSSPVPRVPSPIAQATGS 

JT V \4& K/r\X VI X y^/UUtvifl jjJ*JV/AX\r*£>/"i J. OfVlY XIX <M/n^ J X\JV,TvO/AC>^l * 

HGDASSPPFLAALPDPSPS3P0AVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


53{ 


RSKAK1GTPTREVPSTDMXVRRESSSSLTHRPAPSPATPKLLGT 
RRVLLGVSEGTGCADAMELVLVFLCSLLAPMVLASAAEKEKEKD 
PFHYDYQTLR IGGLVFAWLFS VG 1 Lbl LSRRCKCS FNQKPRAP 
GDEEAQVENLI TANATEPQKAEtf 


5495 


273 


216* 


DSLLLIQVDTMPFTLHLRSRLPSA1RSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAPERFCQVt^TGPLPLLGOSEPEKWMLPP 
OGAlSErRMGHPOFWKyEFGACTGSLASLEQySEQLKDMVAFFL 
GCSFSLEEALEKAGLPRRDPAGHSOAGAYKTTVPCVTHAGFCCP 
LWTMRP 1 PKDKLEGLVRACCSIjGGEQGQPVHMGDPELLGI KEL 
SKPAYGDAMVCPPGEVPVFWPSPLTSLGAVSSCETPLAFAS I PG 
CTVMTDLKDAKAPPGCLTPER1PEVHH1SQCPLHYS1ASVSAS0 
KIRELESMIGIDPGNRG1GHLLCKDELLKASLSL3HARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLQA1>EKEVAI3VD0RAV?N 
LKQKIVEDAVEQGVLKTOIPILTYQGGSVEAAOAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNY YNARKMN3 KKLVDP I DDLFLAAKK 
1PGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACPVEADFA 
VIAGVSNWGGYALACALY1LYSCAVHSQYLRKAVGPSRAPGD0A 
WTQALPSVIKEEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEM3 QKXiVDVTTAQV 


5496 


3 


240J- 


QDTKMHEI YKGNI TPQLNKI^TLKTSAATDVWAVYFSQFWiDYEG 
MKSG KGR PISPVDS FPI£ 1 W 1 CQ PTRYAESQ KE PQTCNQVSLNT 
SOSBSSPLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVtIVHKHVSMOINHYQYLLIjIjFIjHESLIIjLSE 
NLRKDVEAVTC-SPASQTS 3 CI GI LLRSAELALLLHPVOQANTLK 
SP V£ ES VS P WPDYLPTENGDFLSS KR XQI SRD1 NR I RS VTVNH 
MSPKRSMSVDLSH I PLKDPLLFKSASDTNLQKG1 SFMDYLSDKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRS 
DSNI LS FDSDGNQN I LSSTLTS KGNETI ES I FKAEDLLPEAASL 
SENLDISXEETPPVRTLKSQSSL.SGKPKERCPPNLAPLCVSYKN 
MKRSSSOMSLDTISLDSMILEEOLLESDGSDSHMFLEKGNKKNS 
TTNYRGTAESVNAGANLONYGETSPDAISTNSEGAQENHDDLMS 
VWFK1 TGVNGE 1 Dl RGEDT E I C LCVNQVT PDQLGN I S LRHYLC 
NRPVGSDQKAVI HS KSSPE I SLR FES G PGAVI HS LLAEKNGFLQ 
CHIKNFSTEFLTSSL^5NI0HFLEDETVATVMP^^KI0VSNTKIWL 
KDDSPRS STVSLEPAPVTVH2 DHLVVERSDDGSFHI RDSHMLNT 
GNDLKENVKSDSVLLTSGKYnLKKORSVTQATQTSPGVPWPSQS 
ANFPEFSFDFTREQLMEENESLKQEIiAKAKMAIiAEAHIjEKDALL 
HHIKKMTVE 


5497 


1821 


3306 


S ISKLLKRRSNIDAYLLSNE CAFFAPRL.FSLASQ3 2REQQSPNV 
CFIYKY5GFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 

GFSHYSLS se SHVGPTGAGLF p hclp asrllp rvts vhlpdyah 

YYT2GPGMFPSSQIPSWKDKAKPGPYDQPLVNTLQRRKEKREPD 
PWGGGPTTASGPPAAAEEAORPRSMrVSAATRPGEEriEACEEIA 

DYPYFSVSGDOEADQQEFDKSSTIPRNSDISQSYRRWFQAKRPA 
STAGLPTTLGPAMVTPGVATI RRTPSTKPS VRRGTIGAGPIPI K 
TPVIPVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSSMWSGOASVNPPLPGPKPS I PEEHRQAIPBSEAEDQER 
EPPShTVSPGQl PESDPADLS PRDTPQGEDMLNAI RRG VKLKKT 
T7NDRSAPRFS 


5498


2434 


14S2 


ILTHQEIFTGEXPCECGKASIOMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTRSKP FECNECGKAFSQKQYVI KHQNTH 
TGEKLFECNECGKSFSOXENLLTHOKIH'XGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHEKIHIGEK 
PFKCSECGTAFGQKKYLIKHONIKTGEKPYECNECGKAFSQRTS 
L3VHVRIHSGDKPYECNVCGKAFS0SSSLTVHVRSHTGEKPYGC 
NECGKAFSOFSTLALHLRIHTGKKPYOCSECGKAFSOKSHHIRH 
QK1HTH 


5499 


324


92* 


GFGQIGRGHK17TYPFSPRKSGRKGMAQSQGWVKRYI KAFCKGF 
FVAVPVAVTFLDRVACVARVEGASMQPSLNPGGSQSSDWLLNH 
WKVRNFEVHRGDIVSLVSPKNPE0KIIKRVIALEGD1VRTIGHK 
NRYVKVPRGK 3 WVEGDFJiGKSFDSNSFGPVSLGbLHAHATHlLW 
PrfcKWQiUjbbVIjPPkKLPvyKJitJfc. 


5500 


19V 8 


128b 


KPDWR LQNL FPRLYLWRS S R FGFGHLKKRLQMDFKI EHT WDG FP 
VKHEPVFI RLNPGDRG VMMD I SAPFFRDPPAPLGEPGKP FNELW 
DyEWEAFFLNDITEQyLEVHLCPHGQHLVLLLSGRRlWWKOEL 
PLSPRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDXRS 
YEALYPVPQHELQQGOKPDFHCLEYFKSFNFNTL^EEWKQPSS 
DLWLIEKCDJ 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFSLP 
AA1MLALISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGOFSEDMIPTVGFNMRKVTXGNVTIKIWDIGGQPRFRSMWERy 
CRGVNAI WM3 DAADREK3 EASRKELHNLLDKPQLOG I PVLV 1/3 
NKRDLPNALDE KQL I EKMNLS A IQDRE I CCYS I S CKE KDN I D I T 
LQWLI QHSKSRRS 


5502 


3 


624 


NSAFPVWVPBKTALLTCPLGAAPGSSREAPGIAGPPNSTAMSKL, 
GKFFKGGGSSXSRAAPSPQEALVRLRETEEMLGKKQEYLENRIQ 
REIALAXKHGT0NKRAAL0ALKRKKRFEK0LTQIDGTLST3EF0 
REALENSHTNTEVLRNMGFAAKAMKS VHENMDLNKI DDLMQE IT 
EQQDIAQEISEAFSQRVGFGDDFDEDELMAELEELEOEELNKKM 
TN1RLPNVPSSSLPA0PNRKPGMSSTARRSRAASS0RAEEEDDD 
IKOLAAWAT 


5503 


216


654 


KGVRRRGRVRSDS EDSH LGY FXMSFLLPKLTSKXEVDQA I KSTA 
EKVLVLRPGRDEDP VCLQLDD I LSKTSSDLSKMAAI YLVDVDQT 
AVYTOYFDISy I PSTVFFFNGOHMKVDYGGEDPALRS IKAVRRT 
SPAGTLGEKPVKS 


5504 


58 


3563 


QLS FS FQAPVTFDD I TVYLLQE EWVLLSQQQKELCGS N KLVAPL 
GPTVANPEI>FRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYWG 
EM3VQGPTRESGQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRSIQKSWFVOFPWjblMNEEQTALFCSACREYPSIRDKRSRL 
I EGYTGPFKVETLK YHAKSKAK MFCVN ALAARDP 1 WAARFRS 2 R 
DPPGDVLAS PEPLFTADCP I F Y P PGPLGG FDSMAELLPSSRAEL 
EPPGGPGAI PAMYLDCI SDLRQKBJ TDGIHSSSDINI LYNDA VE 
SCIQDP S AEGLSE EVP WFEELP WFEDVAVYFTREEWGMLDKR 
QKELYR DVMRMNY ELLAS LG PAAAKPDL I S KLERRAAP WI KDPN 
GPKWGKGRPPGNKKMVAVREADTOASAADSALLPGSPVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCSACI ERPNLHDKSS RLVRGY TGPFKVETLKYHEVS KAHRLCV 
NTVE I KEDTPHTALV P E I S SDLMANMEHF FNAAYS I AYHS RPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 
VRNSPCVSVLLDSSTDASECACVGIYIRYFKQMEVKESYITLAP 
LYSETADGYFETIVSALDELDIPFRKPGWWGLGTDGSAMLSCR 
GGLVEKFQE VI PQLLP VHCVAHRLHLAWDACGS I DLV KKCDRH 
IRTVFXFyOSSNKRLNELQEGAAPLEQEIIRLKDLNAVRWVASR 
RRTLHALLVSWPAIARHLQRVAEAGGQIGHRAKGKLKLMRGFHF 
VKFCHFLLDFLS I YR P LS EVC0 KEI VL1 TE VNATLGRA YVALES 
LRHyAb P Jvb fch rNAb r KDGRLHl? I UjDIUjS v&EQR t \?ADSERTV 
LTGIEYLQQRFDADRP PQLICNMEVFDTMAWPSGIELA5 FGNTDD1 
LNLARYFECSLPTGYSEEALLEEWLGLKTIAOHLPFSMLCKNAL 
AOHCRFPLLSKLMAWVCVPISTSCCERGFKAMNRIRTDERTKL 
SNEVLNMLMMTAVNGVAVTB YDPQ PAIQHWYLTSSGRR FSHVYT 
CAQVPARSPASARLRKEEMGALYVEEPRTQKPPILPSREAAEVL 
KDCIMEPPERLLYPHTSOEAPGMS 


5505 


3312 


1219 


NCS PRSLSAAKT^SNRNNNKLPSNLPQliCKliI KRDPPAY I EEFLQ 
QY^HYKSNVEIFKLOPNKPSKELAELVMFMAQISHCYPEYLSNF 
PQEVKDLLSCNHT VLDP DLRMTFC KALIIjLRN KNLIN PS SLLEL 
FPELFRCHDKLL-lKTLYTHIVTDIKNIl^KHK^KVIWVLONFW 
VTMLRDSNATAAXKiSLDVMIEbYRRNIWNDAKTVNVITTACFSK 
VTKILVAALTFFLGKDEDSKODSDSESEDDGPTAKDLLVOyATG 
KKSSKNKKKLEKAKKVLKKHRKKKKPEVFNFSAIKLIHDPQDFA 
EKLLK0LECCKERFEVKMMLMNL1SRLVOIHELFLFNFYPFLQR 
FLOPHOREVTKI LLFAA0ASHHLVPPE2 IQSLLMTVAMvJFVTDK 

Mcrri/VTi/r'TMBT vcTPaDPDT 7nuiTT?t?T T-Atm t\ rw VTUf n t/"i\n rxi 
SXaKzbvrtl V*jIivA±J\.fc,J. lJ\K^VLiHML&t*Lil^UL&\JX JVIIiIVJKJn VM 

MS ARTL I HL FRTLN PQMLQ K K FRGK PTEAS I EARVQEYG E LDAK 
DY1 PGAEVLEVEKEENAEKDEDGWESTSLSEEEDADGEVJIDVQH 
£SDEEQQEISKKLNSMP^5EERKAKAAAISTSRVLTQEDF0KIRM 
AOMRKEIjDAAPGKSOKRKYTEIDSDEEPRGELLSLRDIERLHKK 
PKSDKETRLATAMAGKTDRKEFVRKKTKTNPFSSSTNKEKKKOK 
NFMMMRYS0NVR5KNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


FRGDLCGORGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAVW 
LAAF P S LGAGGET P E AP PES WTQLWF FR FWN AAGY AS F MV PGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKA5DEVPLAPRT 
EAAETTPMWQAJLKLLFCATGLQVS YLTWGVLQER VMTRS YG ATA 
TS PG ER FTD SO FLVXjMNR^/Ij AL I VAGLS CVIjC KQ PRHG A P M YRY 
SFASLSNVLSSWCOYEAliKFVSFPTOVIaAKASKVIPVMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAGYlAFDSFrSNWODALFAYXMSSVQMMFGVNFFSCLFT\/GSL 
LEQGALLEGTRFMGRH S E FAAHALLLS I CS ACGQLF I FY ?I GQF 
GAAVFTI I MTLRQAFAI l»L S CLLYGHTVTWGGIiGVAVV FAALL 
LRVYARGRLKQRGKKAVPVESPVQKV 


5507 


3704 


127: 


PRGTRRCRFAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK 
VAVADVOPGPMRFHQDOLOVLLVFTKEDNQCNGFCRACEKAGFK 
CTVTKEAOAVLACFliDKHHDI 1 1 IDHKSPRQLDAEALCR S I RSS 
K1.SENTVIVGWRRVDREELSVMPF2SAGFTRRYVENPN3MACY 
NELIiQLEFGEVRSQLKLRACMSVFTALENSEDAIEITSEDRFIQ 
YANPAFETTMGYQSGEL1GKELGEVP INEKKADLLDT INSCIRI 
GKEWQGI YYAKKKNGDNIOONVKI I PV I GQGGK I RHYVS 1 1 RVC 
NGNNKAEKISBCV0SDTHTDN0TGKHKDRRKGSLDVKAV7iSRAT 
EVSSQRRHSSMARIHSMTIEAPITKVIN1INAAQESSPMPVTEA 
LDRVLE I LRTTELYSPQFGAXDDDPHANDbVGGLMSDGLRRLSG 
NEYVLSTKNTQMVSSNI ITPI SbDDVPPRIARAMENEEYWDFDI 
FELEAATHNRPLIYLGLKMFARFG2CEFLHCSESTLRSWLQIIE 
ANYHS S N PYHN S TK S AD VLHATAYFLS KER I KETLDP I DE VAAL 
IAAT1HDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNI FKNMERNPYRTLRQGI I DMVLATEMTKHFEHVN 
KFVNS INKPLATLEENGETDKNQEVINTMLRTPENRTI>I KRMLI 

FDRNTCS I PKSOISFIDYFITDMFDAWDAFVDLPDLMQHLDNWF 
KYWKGL.DEMKLRNLRPPPE 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN 
VLKKVLVD0LVASPLLGVWYFW5LGCLEGQTVGESC0ELKEKFW 

Cji, I A/U/Vfl. v nrnrtVf vnrjjf v rryrn Vil J. jv o jlj i AjOrTLsi iiiO J J-> 

KYRSPVPLTPPGCVALDTRAD 


5509 


1236 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLKOVDFLNWE 
VTDHNLH E LRVLRRYRLQRREDY TRYNQLS RAVRELAPJRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLOAAVAFVEQGHVRVGPDWTDPAFLVTRSM 
EDFVTWVDS S KI KRHVkEWEERDDFDLEA 


5510 


96 


1155 


PAGAHI.SSGSSEPLVEPGRGRVGARVKGERGU7ASGSAPGRSKM 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 
ADYIGF I LT bNEGV KGKKLT FE YR VSE AI E KLVALLNTLDRW I D 
ETPPVDOPSRFGNKAYRTWYAKLDEEAENLVATWPTHLAAAVP 
EVAVYLKESVGNSTRIDYGTGHEAAFAAFLCTLCKIGVLRVDDQ 
lAIVTKVFNRYLEVMRKuQKTYRMEPAGSQGVWGDDDFQFLPFI 
V7GSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLEC11jFITEMK7 
GPFAEHSNOX.WNISAVPSWSKVNOGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511 


276 


1580 


KLSRVLNLP PENL ITSISAVPIS QKEEVADFQLS VDSL LE KDND 
HSRPDIQVOAKRIAEKLRCDTVVSEISTGQRTVNFKINRELLTK 
TVLOOVIEDGSKYGLKSELFSGLPQKKIWEFSSPNVAKKFHVG 
HliRSTJ 3 GNF1A^LK2ALGHQVIRINYLGDWGM0FGLLGTGFQL 
FGYEEKLOSNPLOKbFEVYVOVNKBAADDXSVAKAAOEFFORLE 
LGDVOALSLWOKFRDLSIEEYIRVYKRLGVYFDEYSGESFYREK 
SQEVLKbhZSKGLLbKTl KGTAWDLSGNGDPSS I CTVmSVGT 
SLYATRDLAAAI DRMDKYNFDTMI YVTDKGQKKHFQQVFQMLKI 
MGYDWAERC0HVPFGWQGMKTRRGDVTFLEDVLNE1OLRMLQN 
MAS I KTTKELKNPOETAERVGLAALI I QDFKGLLLSDYXFSWDR 
VFOSRGDTGVFLOYTHARLKSLEETFGCGYLNDFNTACLOEPQS 
VS I LOKLLR FDE VLYKSSQPFQPRHI VSYLLTLSHLAA VAHKTL 
OIKDSPPEVAGARLKLFKAVRSVloANGMKLLGITPVCRK 


5512 


120 


1015 


DPSL.LLT2TVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
1 EKEK3 SHNTRRFRFGLPSPDHVLGLPVGNYVQLLAK3 DNELW 
RAYTPVSSDDDRGFVDHIKIYFKNVHPOYPEGGKMTOY^ENMK 
IGETIFFRGPRGRLFYJIGPGNLGIRPDQTSEPKKTLADHLGMIA 
GGTGITPMLOLI RK3 TKDPSDRTRMSLI FANQTEEDI LVRKELE 
EIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLI LVCGPPPLI 0TAAHPNLBKLGYTCDM2 FTY 


5513 




637 


ARWRLPSDSPR I PPAGAETPGRGSCRNYI»PSSSPPPPEPSSFPS 
PPTSRGGPGSRDTMSDSEEESQDRQLKIWLGDGASGK7SLTTC 
FAOET FG KQ Y KQTI GhT) FFLRR 1 TLPGMLNVTLQ 1 WD 1 GG QTl G 
GKMLDKYIYGAQGVLLVYDITKYQSFENLEDWYTWKKVSEESE 
TQPLVALVGNKI DLEHMRTI KPEKHLRFCQENGFSSHFVSAKTG 
DSVFLCFQKVAAEILG I KLNKAEI EOSQRVVKADI VNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 




VNRPSVJ3 KGNFRGHALPGTFFFI IGLWWCTKSIIjKYI CKKQKRT j 
CYLG S KTLFYRIjEI LEG I T I VGMALTGMAGEQFI PGCPKLMI>YD 

YKOGKVJNOLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL
MLSNALFVEAFI FYNHTHGREMLD3 FVHOLLVLWFLTGLVAFL
EFLVRNNVLLELILRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM

DHENI LFLTI CFCWHYAV7IVIVGMNYAPITWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515 


1572 


260 


FVRLVGRGDCDPLLSVCLTTMPL.YEGLGSGGEKTAWIDLGEAF 
TKCGFAGETGPRCI I PSVI KRAGMPKPVRWQYN INTEELYSYb 
KEFIHILYFRHLLVNPRDRRWIIESVLCPSHFRETLTRVLFXY 
FEVPSVLIAPSHLMAl/LTLGINSAMVltDCGYRESLVLPl YEGIP 
^CWGALPLGGKALHKELETQLLEQCTVDTSVAKEOSLPSVMG 
SVFEGX^LEDIKARTCFV5JDLKRGIjKI0AAKFWJDGNNEKPSPPP 
NVDYPLDGEKlbHI LGS IRDS WEILFEQDNEEQSVATLI LDSL 
I QCP I DTRKOLAE NLW I GGTS MLFG FLHRLLAE I R YLVE K PKY 
KKALGTKTFR3HTPPAKANCVAWLGGAIFGALQDILGSRSVSKE 
YYNQTGRIPDVJCSLNNPPLEMMFDVGKTQPPLKKRAFSTEK 


5516 


2 


735 


KSREPP(?AGPGPSPRKSPTAS"SFLF>WRPIiASSFWMGAQGA0E5 
I KAMWRVPGTTRRPVTGES PGMHRPEAMIjLLLTIiALLGGFTWAG 
KMyGPGGGKYFSTTEDYDHEITGLRVSVGLLuVKSVQVKLGDSW 
DVKLG ALGGNTQE VTLQFGEY I TKVF VAFOAFLRGMVMYTS KDR 
Y FYFGKLDG03 SSAYPSOEGQVI.VG1YGQYQL1X5I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR


5517 


246


4S9 


S E I YVAMR TDSS KMTDVESGVANFASS ARAGRRNALPDI QSSAA 
TDGTSDLPLKLEALSVKEDAKEKDEKTTQDQLBKPONEEK 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLlibPLVAAIibFNYHRQEGMEA 
FLKTVAON Y5 S VTHLHS 3 GKSVKGRNLWVLWGRFPKEHR1G3 P 
EFKYVANMHGDETVGR SLLLHLIDYLVTSDGKDPE1 TNLINSTR 
IHIMPSMNPDGFSAVKKPDCYYSIGRENYNQYDLNRNFPDAFEY 
raTOR0PETVAVMKWbKTETI^5ANLKGGALVA5yPFrNGV0A 
TGALY S R S LT PDDD V FQY LAHT YAS RNPNMKKGDEC KN KMNF PN 
GVTNGYSWYPLOGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKASLIEY3 KQVHLGVXGQVFDQNGNPLPNV3 VEVQDRK 
H:C?YRTNKYGEVYbLl.IjPGSyiIKVTVPGHD?HlTKVnPEKS 
ON ?S ALKKD1 LLP FQGQLDS 1 PVSNPSCPMI PLYTcNLP DKS AAT 
KPSLFLFLVSLLK2 FFK 


5519 


87 


" 477 


I KSKLNQQVEVQEE EWRLTEAKGPTMGKESGWDSGRAAVAAWG 
G VVAVGTVLVALS AMG FTSVG I AASS I AAKMMS TAA 1 ANGGGVA 
AG S LVA I LQS VGAAGLS VTSKVI GGFAGTALGA WIGS PPSS 


5520 


117 


943 


PTEGRQKVLKTFTVPRSAbANTKTSTCIYHFLVLSWYTFLNYYI 
SOEGKDEVKFKlLANGARWKYMTbLNLLLQTIFyGVTCLDDVLK 
RTXGGKD1 KFLT AF RDLLFTTLAFP VST FV FLAF W 1 LFLYNRDL 
1YPKVLDTVIPVWLNHAKHTFIFPITLAEWLRPHSYPSKKTGL 
TLLAAAS I AY I SRI L WLYFETGTWVY P V FAKLS LLG LAAF FSLS 
YVFrASIYbbGEKLNHWKWVSVQILQRWRLESVGlCFOWPDWKS 
PAKHQLVKNIR 


5521 


541


911 


K1LNMQKSCEEMEGKPQNMPKAEEDRPLEDVPQEAEGNP0PSEE 
GVSQEAEGNPRGGPNOPGOGFKEDTPVRHLDPEEMIRGVDELER 
LREE I RRVRN K F VKKKW KQRH S R SRP Y PVCFRP 


5522 


1224 


637 


GSRPLGCRSREK-^WVFGYGSLIWKVDFPYODKLVGYITNYSRRF 
WQGS TDH RG VPGK PG K WTLVE D PAG CVWG V AY R LPVG K E E E VK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEOIFNW^GPSGRNTEYLFELAKSIRNLVPEKADEHL 
FALE KbVKERLEGKQKLNC I 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQILRAIGKGSFG 
KVC I VQKRDTE KNi Y AM K Y MNKQQC I ERDEVRNV F RELE I LQE IE 
HVFLVNLWYSFODEEDMFWWDbLliGGDLRYHLQQNVQFSEDTV 
RI/Y W CFMRLRI ,r>Yl .TRf^OWT T HRriVKPtttaiT .I.DFRGRAWI/TnFiai 

J _i UJQ» in l.\r\2JkJ J JUJt-VJj Vii X -A iU\U V 1\JT xJlv JL UJUl^ J^rv VJilJ^ri Jw i Uir 111 j. 

ATI 2 KDGERATALSGTKPYMAPEIFHSFVJ8GGTGYSFEVPWWSV 
GVMAYELLRGWRPYDIKSSNAVESLVOLFSTVSVQYVPTV7SKEM 
VALLRKLLTVNPZHULSSLQDVQAAPAUiGVbVJDKLSEKRVEPG 
FVPNKGRLHCDPTFELEEMILESRPLHKKKKRLAKNKSRDNSRB 
S5QSENDYLQDCLDA1 QQDFVI FNREKLKRSQDLPREPLPAPES 
RDAAE P VEDE AERS AL PMCG P 1 CPSAG SG 


5524 


86 


2318 


RERERDHRPGESSOGOSGAGGCFPSPTMELRCGGLLFSSRFDSG 
N LAH VEKVES LS S DGEG VGGG AS ALTSG I ASS P D Y E FN VW TR PD 
CAETEFENGNRSV? FY FSVRGGMPGKLI KINIMNMNKOSKLY SQS 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFV^RFVEGRGA 
TTFFAFCYPFSY S DCQELU3QLDQRFPENKPTHS S PLDT1YYHR 
ELLCYSLDCLRVDLLTXr$CHGhREDREPnUEQhF?DTSTPnPP 
KFAGKRIFFXSSRVKPGETPSSFVFNGFLDFILRFDDPRAOTLR 
RLF\^KLIFMLNPIX?V^GHYRTDSRGVNLNRQYLKPDAVLK?A 
IYGAKAVLLYH1T/HSRLNSQSSSEH0PSSCLPPDAPVSDLEKAN 
NLONEAQ CGKSADR HNAE A WKQTEPAE QKLNS VW I M PQQS AGI>E 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNS FSDESTQVE 
KMLYPKL1SLNSAHFDF0GCNFSEKMMYARDRRDG0SKEGSGRV 
Al Y KASG 1 I H S YT LE CN YN7GRS VNS 1 PAACHDNGRASPP P P PA 
FPSRYTVELFEOVGRAMAlAALDMAECNPWPRJVbSEHSSLTOL 
RAKMLK1TV^SRGLSSTLNVGVNKKRGLRTPPKSKI>JGLPVSCSE 
NTLSRARSFSTGTSAGGSSSSOQNSPQMKNSPSFPFHGSRPAGL 
PGLGSST0KVTHRVLGPVFGKPVWEPLOHVFGCI.GHCWGK 


5525 


105 


834 


SNTLDFERHLF1KG0OISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEE FLGRVAELNDVTAKVASGQEKKLjL FEVQ PG S DSS AF WKV 
VVRWCTKINKSSGIVEASRIMNLYQFIQLYKDITSOAAGVLAQ 
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCICMDGRAD 
LI LPCAKSFCOKCI DKWSDRHRNCPI CRLQMTGANESWVVSPAP 
TEDEMAN yi LNMADEAGQPHRP 


5526 


3 


853 


RRPCKPVRAAKRTGAAARAPRGLEVTMLRVAWR'J'LSl,IR77xAVT 
QVLVPGLPGGGSA K FP FNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDP P PSTLLKD YQNVPG I EKVDDW KRLLS LEMANK KEMLKI 
K0ECFMKKIVANPEDTRSLEARIIALSVKIRSYEEHLEKI1RKDK 
AHKRYLli^SlDQRKKMLKNLRNTNYDVFEKlCWGLGIEYTFPPL 
YYRRAHRRFNTTKKALCIRVFQETOKLKKRRRALKAAAAAQKOAX 
RRNPDSPAKAIPXTLKDSQ 


5527 


3225 


565 


Ll»RKyLbHON?LLl»RHQPNRTCIS?SATMKLKDTKSRPKOSSCG 
KFCTKGIKWGKWKEVKIDFNMFADGCMDDLVCFEELTDyQLVS 
PAKKPSSLFSKEAPKRKAOAVSEEEEEEEGKSSSPKKKIKLKKS 
KNVATEGTSTQKEFEVKDPELEAQGDDKVCDDPEAGEMTSENLV 
g?A?KKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 
AWKDLFVPRPVLRALSFLGFSAPTPIOALTLAPAIRDKLDXLGA 
AETGSGKTLAFAIFMIHAVLOWOKRNAAPPPSNTEAPPGETRTE 
AGAETRSPGKAEAESDALPCDTVIESEALPSBIAAEARAKTGGT 
VSDOALLFGDDDAGEGPSSLIREKPVPKQNEWEEENLDKEQTGN 
LKQELDDKSATCKAYPKRPLLGLVLTPTREl^VQVKOHIDAVAR 
FTGI KTA1LVGGMST0KQ0RMLNRRPEI WATPGRLWELI KEKH 
YHLRNLRQLRCLWDEADRMVEKGHFAELSOLIiEMLNDSOYNPK 
ROTIiVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVIDLTRNEATVETLTETKIHCETDEKDFYLYYFLMOYPG 
RSJjVTANSl SCI KRliSGbbKVLDIMPLTLHACMHQKQRLRNLiEQ 
FAR LEDCVLLATDVAARGLDI PKVQHVI K YQVPRTS E I YVHRSG 
RTARATOEGLSLML1GPEDVINPKKIYKTLKKDED3PLFPVQTK 
YMDWKERI RLARQIEKSEYRNFQACLHNSW1 EOAAAALEIELE 
EDMYKGGKADQOEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 
YPTOSGXPPLLVSAPSKSESA^SCLSKQXKKXTKKPKEPQPEQP 
QPSTSAN 


5528 




855 


fZpFi c-nfRMVJftiariA/KVHrmT.ATT T TIiRRYl.RKIATMAlC^KFF 
vr r -ijo/*wx\i*iTf yrtvi\v *\ v nvoyn j ioi i x u ^ l j _~ r\ 1 1 i\o csx w 

Y VR DFEADDTCLAHCWWVRLDGKNFHR FAEKHNFAXPNDSRAi 
OLK?KCAOT\ r MEELEDlVIAYGOSDEYSFVFKRKTNVJFKRPJVSK 
FKTHVASOFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNCT 
LKDYLSWROADCHINNLYKTVFWALIOQSGLTPVOAOGRLQGTL 
AAD KN E I LF S E FN 1 N Y E PPM YR KG T V L I W Q KV DEVMT KE I Ki> 
PTEMEGKKMAVTRTRTKPCKPSKLPRAPCLRWL 


5529


4£ 


640 


TFRLVSAHLKTRKLINPEAAERRWRDWDSROGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAAR0PR3MEEXALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLORCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAA>3EN PNLR E I VEOCVLE PD 


5530 


454: 


2606 


A0I VHAI SYCHKLHVGHRDLKPENWFFEKQGIjVKLTDFGFSNK 
FOPGKKLTTSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGOPPFOEANTDSErLTMIMDCKYTVPSHVSKECKDLlTRMLQR 
DPKRRASuEElEKHPMLQGVDPSPATKYNIPLVSYKNLSEEEHtJ 
S 1 1 0RMVLGD 1 ADRDAIVEALETNRYlvHITAT YFLLAERI LREK 

ofxei ornsASPSm kaqfrqswptk i dvpqdleddltatplsh 

ATVP0SPAPJlAI)SVl^GHRSKGLCDSAKja3DLPElAGpALSTVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
WNRLTSRJ<£APVI;NQIFEEGESDI)EFDMDENLPPKLSFLKMNI 
ASPG7VHKFYHRRKS0GRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMSLCLGSQLHGSTKY1 1 DPQNGLS FSS VKVQEKSTWXMCI SST 
GNAGQV PAVGC-I KFFSDHMADTTTELERI KSKNLKNNVLQLPLC 
EKTISVNIORNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSOPRAPRPRDSMERPEPELIROSWRAVSRSPLEHGTVIiFARLF 
ALEPDLliPLFQ YNCRQFSS PEHCT .SS PE FLDH I RKVMLV I DAAV 
TNVEDLSSLEEYIoASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSOLYGAWQAMSRGWDGE 


5532 


3395 


1402 


SDWFiWGKRKMIIEDETEFCGEELLHSVLQCKSVFDVLDGEEMR 
RARTRANPYEK1 RGVFFT J NRAAMKMA>JT'5DFVFDRMFTNPRDSYG 
KPLiVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTXjK 
GPNDFKLEDFYSASSELFEPYYGEGGIDGDGDITRPBNISAFRN 
FVLDNTDRKGVKFU4ADGGFSVEGQENL0EILSKQLLLCQFLMA 
LSIVKTGGHFlCKTFDLFTPFSVGLVi'LLYCCFERVCLFKPITS 
RPANSERYWCKGLKVGIDDVRDYLFAVN I KLNOLRNTDSDVNL 
WPLEV IKGDHEFTDYMI RSNESHCSLQ J KAIAK1 HAFVQDTTL
SEPRQAE1RKECLR1WGIPDQARVAPSSSDPKSKFFELIQGTE2
DI rbYKPTLIjTSKTLEKIKPVrDyK^ 

YTWDGRQSDRWI KbDLKTEbPRDTLLSVE 1 VHELKGEGKAQRK1 
S Al }i I LDVLVLNGTDVREQB FNQR 1 QLAE KFV KAVSXPS RPDMN 
PIRVKEVYRLEEMEKI FVRLEMK1 I KGSSGTPKLSYTGRDVRHF 
VPMGLYIVRTVWEPWTMGFSKSFKKKFFYNKKTKDSTFDLPADS 
IAPFHICYYGRLFVJEWGDGIRVHDSQKPOD0DKLSKEDVLSFIQ
MHRA 


5533 


94 


78^ 


MKBRRAPQPWARCKLVLVGDVOCGKTAMLOVbAKDCYPETYVP
TVFENYTACLETEEQRVELSLWDTSGSPYYDNVRPLCYSDSDAV
LLC F DI SRPET VDS ALKKW RTEILDYCPSTRVLLI GCKTBLRTD
LSTLMELSHQKOAPISYEQGCAIAKQLGPEIYLEGSAFTSEKSI
HSIFRTASMLCLNKPSPLPQKSPVRSESKRLLHLPSRSELISPT
FKKEKAKXCS I M


5534 


3 


60! 


LVRGRARAANPGRVGAMDGLRORVEHFLEORNLVTEVLGALEAK
TG VE KR YLAAG AVT LLS LY LL» FG YG AS LECNE I G FVY PA YAS I K
AlESPSKDDDTVWLTYWVVYALFGLAEFFSDLIibSWFPFYYVGK 
CAFLLFC^PRPWNGALMLYQRWRPLFLRKHGAVDRIMNBLSG 
RALDAAAGITRNVXPSQTPQPKDK 


5535


1029 


335 


KSFMDSEARLCSLVELSDTODETOKSDSENEDLKIDCLQESQEL 
NLQKLKNSER3 LTEAKQKMRELTVNl KMKEDL1KELIKTGNDAK 
SVSKQYTbKVTKLEHDAEQAKVELTETOKOLOELENKDliSDVAM 
KVKL0KEFRKK\T)AAKLRVQVLOKKQ0DSKKIiASbSIQNEXRAN 
ELEOSVDHMKYOKIQLORKljOEENEKRKObDAVI KRDQQKI KVI 
LSYI PAKYNMKC 


5536 


942 


28", 


AAATAASLSPRGCRLRTPSSDVSPSRAPPPSAAPLPTGRAQMSP 
SGRLCLLTIVGLILPTRGQTLKDTTSSSSADATIMDIQVPTRAP 
DAVYTELQPTSPTPTWFADETPOPOTGTQOLEGTDGPLVTCPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGPHEDDPF 
F YDEHTLRKRG LliVAAVLF I TG 1 1 J LTSGKCRQESRLCRNHCR 


5537 


3 


2391 


KAR VSS PQLR VFR SGR PRRLR VLR 1 NRTS VA LRLAGTGRFVAXT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAOKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEORKEPVINVKSEETVAIQPDVFSH 
YNKDLLTEHCTE AS F0KVI SRRHGSCDLENLHLRKRWKREECEG 
ilNGCYDEKTFKYDOFDESSVESLFHQOILSSCAKSYNFDQYRKV 
FTHSS LLNQQEE I DI WGKHH I Y DKTS VLFRQVS TENS YRNVFI G 
EKNYKCNNSEKTLNOS SSP KNHQENY FLEKOYKCKE FSEVFLQS 
MHGOEKOEQSYXCNKCVEVCTOSLKM 1 QHQTIH1 RENSYSTOKY 
DKDLSOSSNLRKOUHNEEKPyKCEKCGDSLNHSLHLTQHQIIP 
TEEKPYKWKECGKVFNLNCSLYLTKQQQIDTGENLYKCKACSKS 
FTRSSNLIVHQR1HTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EXPYKCKECGKAFNRSSCLTOHOTTHTGEKLYKCKVCSKSYARS 
SNLIMHQRVHTGEKPYKCKhCGKW 

KCKVCAKPFTCFSNL1VHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTKTGEKPYTCKE 
CGKAFSYSSDVI 0HRR1HTG0R ?Y KCEECGKAFT5YRSYLTTHQR 
SHTGERPYKCEECGKAFNSRSYJjTTKRRRHTGERPYKCDECGKA 
FSYRSYETTHRRSHSGERPYKCEECX5KAFNSRSYLIAHQRSHTR 
EXL 


5538 


926 


163 


KSMMMKI PWGSI PVLMLLELLGLID1 SQAQLSCTGPPAIPGI PG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPXGESGDYKATQKIAFSATRTI 
WVPLRRDQTIRFDHVI TIMNNNYEPRSGXFTCKVPGLYYFTYHA 
SSRGNLCVNLMRGRERAQKWTFCDYAYKTFQVTTGGMVLiaEO 
GENVFtQATDKNSLLGMEGANS r FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGC^pRGOALEGPRSCRRPQPMARRYDEliPHYPG 
I VDG PAAJUAS FPE TV PAVPG P YG PKR P P QPLP PGLDS DGLKREK 
DEIYGHPLFPLI4ALVFEKCELATCSPRDGAGAGI1GTPPGGDVCS 
SDSFNEDIAAFAKOVRSERPLFSSNPELDNLVIQAIQVLRFHLL 
ELE KVHDLCDN FCHRY 2 TCLKGKMP 2 DLV1 EDRDGGCREDFSDY 
PASCPSLPDONNMWIRDHEDSGSVHLGTPGPSSGGLASOSGDNS 
SDQGDGLDTSVASPSSGGEDEDLDQERRRNKKRGIFPKVATOIM 
RAWLFOHLSKPYPSEEQKKQLAODTGLTILQVWNWFINfARRRIV 
QPM1 DQSNRTGOG AAFS PEGQP1GG YTETQPHVAVRPPGS VGMS 
LNLEGEWKYL 


5540 


148 


1440


PPIX5AGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWR0HRG 
PSGAMPGCALPRGOALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPKRPPQPLPPGLDSDGLKRBKDE1 
YGHPLFP LLALVF EKC ELATC S PRDGAG AGLGTP PGGDVCS SDS 
FNEDNTA FAKQVRSSRPL FS S NPELDN LM I QA 1Q VLR FHLLELE 
XGKNPI DLVIEDRDGGCR EDFEDYPAS CPSLPDONNIMIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGI FPKVATN IMRAWLFQHLSHPYPSEEQXXQ 
LAQDTGLT I LOVNNWF I N ARRR I VQ PM I DQSNRTGOG AAFS PEG 
QPIGGYTETEPKVAFRAPASVGDEFGTRKEEWHYL 


5541 


148 


1440 


P P LGAGAG VHAR S P H P AR RL P LTT AGVGGRAPDLLPT PWROH RG 
PSGAAAPGCAIiPRGOALEGPRSCRRPOPMARRYDELPMYPGIVP 
GP AALAS F P ETV P AV PGP YGP HR P POP LP PGLDSDGLKREKD E I 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVR SERPLFSSNPEUDNLM I QAIOVLRFHLLELE 
KGKMPIDLVIBDRD6GCREDFEDYPASCPSLPD0NWIWIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDOGVGbDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEOKKQ 
LAQDTGLT I LQVNN WFI NAKR R I VQPK 2 DQSNRTGOG AAFS PEG 
QP I GG YTETEPHV AFRA PAS VGDE FGTR K E E WHYL 


5542


148 


1440


PPU5AGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PS GAAAPG CALPRGQ ALEG P RS C RRPGPMARRYDE L PHY PG 2 VD 
G PAALAS FPETVPAVPGP YG PHR PPQPLP P GLDSDG LXREKDB 1 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNWAXQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE 
KGKMP I DLV I EDRD6GCREDFED YPAS CPS LPDQNN I W 1 RDHED 
SGS VHLGTPG PSS GGLAS QSGDNS S DOGVGLDTS VAS PS SGGED 
EDLDQEPRRNXKRG2FPXVATN1MRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLT 2 LQVNNW F I NARRR 2 VQPM2 DQSNRTGQGAAF3PEG 
OPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5543 


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP 
KRPFSDSGAFWS PERRPG VLEAPRRR PVPAS FRAVP PKPTRVHG 
SSASRDRVLARTM2VADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KESRARRGPRGPSAF2PVEEVLREGAESLEQHLGLEALMSSGRV 
DNLAWMGLH PDY FTS FWRLHYLLLHTDG PLAS SWR HY 2 A I MAA 
ARHQCSYLVGSHMAEFLOTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHRPWLITKEH20A1jLKTGEKTWSLAEL1QALVLLTHCHSLS 
SFVFGCG1LPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF
ESARDVEALNERMQOLOESLLRDEGTSQEEMESRFELEKSESLL 
VTPSADI LEFS PH PDKLCFVEDPTFGY EDFTRRG AQAPPTFRAQ 
DYTWEDHGYSL20RLYPEGGQLLDEKF0AAYSLTYNT1AMHSGV 
DTSVLRRAI WNY I HCVFG2 R YDD YDYG E VNQLLERNLKVY 2 XTV 
ACYPEKTTRRMWLFWRHFRHSEKVIIVIJLLLLEARMQAALLYAL 
RAITRYMT 


5544 


1895 


524 


LGGLLGRORLLLRWGAGRLGAPMERHGRASATSVSSAGEOAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVESLRKKRPLFPWFGLDIGGTLVKLVYFEPKDITAEEEEEEV 
ESLKS I RKYLTSNVAYGSTGI RDVHLE LKDLTLCGRKGNLHFIR 

LQLCKLDELDCL 2 XG I LY 2 DS VGFKGR S QCY YFENPADSEKCOK 
LPFDLKNPYPLLLVNIGSGVS ILAVYS KDN YKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKYPKLVRD2YGGDYBRFG 
LPGKAVAS S FGNMMS KEKREAVS KEDLARATLIT1 TNNIGS I AR 
M CALNEN2 NOWFVGN.FLRIrJTl AMRLLAYALDYWS KGQLXALF 
SEHEGYFG AVGALLELLK1 P 


5545 


802 


isa 


GAM WS AGR GG AA W P V L LGL LLALLV PGGG AAKTG AELVTCG S VL 
KLLNTHIiRVRLHSHDIKYGSGSGOOSVTGVEASDDANSyWRIRG 
GSEGGCPRGSPVRCGC'AVRLTHVLTGKNLHTKKFPSPLSNNQEV 
SAFGEPGEGDDLDLV:?VRCSGCHWEREAAVRFOHVGTSVFbSVT 
G3QYGS PI RGQHEV KGMPS ANTKNTW KAMEG I F 1 KPS VE PSAGH 
DEL 


5546 


1592 


146


FV PRGGHSSMGQ SGR £ RHQ KRARAQ AQLRN LE AY AAN P H S F VFT 
RGCTGRNI RQLS LD VRR VMEPLTASRLQVR K KNS LKDCVAVAGP 
LGVTHFLI LS KTETNVY FKLMR LPGG PTLT FQVXK YS LVRD WS 
SLRRHRMHEQOFAHPFbLVLNSFGPHGMHVKLMATMFONLFPSI 
NVHKVNLNTIKRCLLI DYNPDSQELDFRHYS I KWPVGASKGMK 
KLLQEKFPNMSRL0D1SELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAC^SAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVS KTEEELQAI L E AKE K KLR LKAQRQ AQQAQNVQRKQEQR E AH
RKKSLEGMKKAR VGGSDEEASGl PSR TASLELGEDDDEQEVDDI
EY FCQAVGEAPSEDLF PEAKQKRLAKS PGRKRXRWEMDRGRGRL
CDQKFPKTKDKSQGAOARRGPRGASRDGGRGRGRGRFGKRVA


5547 


1592 


146 


FV PRGGHSS MGOSGRSRHOKRARAQAQbRNLEAY AAN PHSFVFT
RGCTGRNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP
LGVTHFLI LSKTETI^Tv r YFKLMRJbPGGPTLTFQVKKYSLVRDVVS
SLRRHRMHEQQFAKPPLLVXKSFGPHGMHVKLMATMFQNLFPSI
NVHKVNLNTI KRCLL3 DYNPDSOELDFRHYSI KWPVGASRGMK 
KLLQEKFPNMS RLCD1S ELLATGAGLSBSEAEPDGDHNI TELPQ 
A VAGRGNMRAQpSA VRLTEI GPRMTLQLIKVQEG VGEGKVMFHS

FVS kteeeloa: le akekklrlkaqrqacxjaonvqrkqeqreah 

RKKSLEGMKKARVGGSDEEASG3 PSRTASLEJjGEDPDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKO'jCRLAKSPGRKRKRVJEMDRGRGRL 
CDQKFPKTKDKSQGAQARRG PRGASRDGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQTGPPETIAFTFFRGTMEPLCPLLLVGFSLPLARALRGNETTA
DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 
RKQRKAWSTSDKK^PNG 1 LEEQEQQRVMLLSR SPSGPKKY FPI 
PVEHLEEEIRIRSADDCK0FREEFNSLPSGHIOGTFEIANKSEN 
RE KNR Y PNILPNDKS R V2LSQLDG2 PCSDY1NA SY1DG YKE 1GJK 
FIAAOGPKOETWDFKRMVWEOKSATIVMLTNLKERKEEKCHQY 
WPDOGCWTYGNIRVCVEDCWLVDYTIRKFCIQPOLPDGCKAPR 
LVSQLHFTSVJPDFGVPFTP I GMLKFLKKVKTLNPVHAGP1 WHC 
SAGVGRTGTFIVIDAi^iAKMHAEQKVDVFBFVSRIRNORPQMVQ 
TDMQYT F I YC ALLEY yi^YGDTELDVS SLEKHLQTMHGTTTHFDK 
I GLEEEFRKLTNVR I MKENMRTGNLPANMKKARV IQ1 1 PYDFNR 
VILSMKRGQEYTDY I NASFI DGYRQKDYFIAl'OGPLAHTVEDF'W 
RMI. WEWKSHTI VM LTE VQE REQDKCYQ YWPTEG S VTHGE I T I EI 
iCNDTLSEAIS IRDFLVTLN0P0AR0EEQVRWRQFHFHGWPEI G 
I PAEGKGM IDL3 AAVQKQQQOTGNHP ITVHCSAGAGRTGTF I AL 
SNILERVKAEGLLDVFCAVKSLRLQRPHMVQTIiEOYEFCYKWO 
DFIDIFSDYANFK 


5549


SIS 


256 


FEATGGKRIiAFKWAGTARHDREflAIOAKKKLTTATDPIERLRLO 
CLARGSAG1KGLGRVFRIMDDD?3NRTLDFKEFMXGL>3DYAWME 
KEEVEELFQRFDKDGNGTI DFNEFLLTLRPPKSRARKEV1 MQAF 
RKL0KTGDGVITIEDLREVYNAKHHPKYQKGEWSEEOVFRKFLD 
NFDSPYDKDGLVTPEEn^NYYAGVSAS IDTDVYFI IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTI ME FS VYQDTWMKYE YEVDKDFSSKLR INIDI 
TVAWKCQYVGADVLDJ,AFTMVASADGLVYEPTVFDI»SPQOKEWQ 
RMLQLIOSRLOEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYVNKVAGNFT4JTVGKAIPHPRGHAHLAALVTIHESYN 
FSHRIDHLSPGELVPAI INPLDGTEKJAIDHNQMFQYFITWPT 
KLHTYK I S ADTHQ FS VTERER I INHAAGSHGVSG I FMKYDLS SL 
MVTVTEEHMP FWQFFVRLCG I VGGI FSTTGMLHG IGXFI VE 1 1 C 
CRFRLGS YKPVNSVP FEDGKTDNHLPLLENNTK 


5551 


211 


1700 


MQRDHTMDY K ES CPS V S I PS S DEHREKKKRFTV Y KVLV SVGR S E 
WFVFRRYAEFDKLYNTLKKQFPAMALKIPAKRIFGDNFEPDFIK 
ORRAGt.NEF3ONLVRYPELYNHPDVRAFLOMDSPXK0SDPSEDE 
DSRSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LlAKRKbDGKFYAVKVi^KKIVLlTOKEQ^IKAERIWbLKKVXH 
PFLVGLHY c ;FrTTFKL.yFVLDFVNC?Gn .FFHliORFR ^FPFHRAR 
FYAAElASAbGYLHSIKIVYRDLKPEKILLDSVGKWLTDFGLC 
KEGIA1SDTTTTFCGTPEYLAPEVIRKOPYDNTVCWWCLGAVLY 
EMLYGLPP? Y CRDVAEMYDN I LHKPLSLR PGVSL7A WS 3 LEELL 
EK "JRONR LG AKEDFLE 1QNHPFFESLS WADLVQKK3 PP P FNPNV 
AGPDD3RN?DTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FS YAPPS EDLFL 


5552 


274E 


930 


LGPAAGAAl^IGKXHKKHKAEWRSSYEDyADXPLEKPljKbVLKVGG 
SEVTELSGSGKDSSYYDDRSDHERERKKEKXKKKKKKSEKEKHL 
DDEERR KRKE EKKRKR EREK CDTEGEA DDFDPGK KVE VE PP PDR 
PVRACRTOPAENESTPIOOLLEHFLRQLQRIOPHGFFAFPVTDA 
1 APGYSMI 1 KHPMDFGTMKDK1 VANEYKSVTEFKADFKLMCDfJA 
MTYNRPDTVYYKLAKKILHAGFKMMSKOAALLGNEDTAVEEPVP 
EW P VQV ETA KK S KXP £ REV 3 S CMFEP EGNACS LTDS TAEEH VL 
fvb V tHAADEAR DR I NRF LPGGKMbYI-iK RNGDOSliLi YS VVNTAEP 
DADEEETHPVDUSSLS SKLLPGFTTIjG FKDERRNKVTFLSSATT 
ALSMONNSVFGDLKSDEMELLYSAYGDETGVOCALSbQEFVKDA 
GS YS KKWDDbLDQ 3 TGG DHS RTLFQL KQRRNVPM KP PDEAKVG 
DTbRDSSSSVLEFWSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
HLNLDETTKLLODLHEAQAERGGSRPSSNLSSLSNASERDQHHI* 
GS PS RLSVGEQPDVTHDPY EFLQSPEPAASAKT 


5553 


74 


1095 


LGREAVYLVS RMDG PVAEKAKQE PFH WTPLLSS WALS QVAGMP 
VFIjXCENvQPSGSr KIRGlGHFCQEMAKKGCRnLVCSSGGNAGI 
AAAYAARKLGI PAT J VLPETSTSLQWOHbQGEGAE VObTGKVKD 
EANLRAOELAKRDGWENVPPFDHPblWKGHASLVOELKAVLRTP 
PGALVLAVGGGGbLAGWAGLLEVGW0HVP3IAMETHGAHCFNA 
Al TAGXLVTLPD3 TSVAKS LGAKT VAARALECMQVCK3 H $ EWE 
DTEAVSAVQOLLDDERMbVEPACGAALAAIYSGLLRRLOAEGCL 
PPSLTSWV3VCGGNN3NSRELQALKTMU30V 


5554 


36f. 


2310 


CSGRTGGRGSLR PAENVCLTCKLSGAETRGLbCPALRTWI MKVL 
GRSFFWVLFPVbPWAVQAVEHEEVAQRV3KLKRGRGVAAM0SRQ 
WVRDS CRKLSGLLROKNAVbNKLKTAI GAVEKDVGLSDEEKLFQ 
VHTFEI FQKEI.N ESENSVFQAVYGLQRAIiQGDYKDWNMKESSR 
ORLEALRICAAlKEETEYMELbAAEKHgVEAbKNMpHONOSLSML 
DEILEDVRKAADRbEEBIEEHAFDDNKSVKGWFEAVLRVEEEE 
ANS KCN3 TKREVEDDLGLSMblDSONNOYl LTKPRDST3 PRADH 
HF3KDIVT1GKLSLPCGWLCTAIGLPTMFGY3ICGVLLGPSGLN 
S3KS 3 VQVETLGEFGVFFTLFLVGLEFS PEKLRKVWK1SLOGPC 
YMTLLMI AFGLbWGHLLRI KPTQSVF3 STCLSLSSTPLVSRFLM 

OO/iAUJ^ADfl^ui CIV J-J Lt\J i ; ±J v J, \JAJ V y^JO JUJ/ rvlVJ ir i X-jX y/iu/lo 

ASSS I WEVLR 3 IiVL3GG;3 LFSLAAVFLLCLV3 KKYLIG PYYRK 
LHMESKGNKE3L3 LGISAF3 FLMLTVTE LLDVSMELGCFIjAGAL 
VSSCG PVVTEE 3 ATS I EP 3 RDFLAI VFFAS IGLHVF? TFVA YEL 
TVLVFl J TLSVWMKFbLAALVLSLlLPRSSOY3KW3VSAGLAQV 
SiFSFVLGSRAKRAGVISREVYLLIbSVTTLSLLIaAPVbWRAAI 
TRCVPRPERRSSb 


5555 


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
GTKAP0NLSTFCLLLLYLIGAV3AGRDFYKILGVPRSASIKDIK 
mRKlALQLHFDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
nTYftFFGI.KDfiHn^HftnT F^PFFttnFGFMFGGTPROODRNI PR 
GSD33VDLEVTLEEVYAGNFVEVVRNKPVARQAPGKRKCNCRQE 
MRTTQLGPG3FGMTQEVGCDECPNVXXVNEERTLE VE 3 EPGVRD 
GMEYPF3GEGEPHVDGSPGDLRFRIKWKHP3FERRGDDLYTNV 
T3 SbVES LVGFEKDI THLVGHKVH1SRDK I TR PGAKLWKKGEGL 
PNFDNNN3 KGSL3 3 TFDVDFPKEQLTEEAREG3KQLLKQGSVQK 
VYNGLQGY 


5556 


" " 5835 


3346 


RTRGMSKNCVPMEFEE YLbRMFQGTFYLU?KI TKCNNAHTVXSR 
LEELDESyiEKFTDFLRLFVSVHLRRIESySQFPWEFLTLLFK 
YTFHOPTKEGyFSCLDlWTLFLDYLTSKlKSRLGDKEAVLNRVE 
DALV LLLTE VLMR 1 QFRYNQAQLEELDDETLDDDOOT E WORV IjR 
OSLEWAKWELLPTHAFSTLFPVIiQDNLEVYLGLQOFIVTSGS 
GHRIjM I TAENDCRRLHCSLRDLSSlibCAVGRIiAEVFl GDVFAAR 
FNDALT\A/ERbVKVTLYGSOIKLYNIETAVPSVLKPDLIDVHAQ 
SIAALOAY SH WLiAQY CSEVHRQNTQQFVTbl STTMDAITPL1 ST 
XVQDKLLLSA CKLLVS LATTVR PVFL3 S I PAVQKV PNRI TDASA 
LRLVDXAQVLVCRALSNI UiPWPNLPEN EQQWPVRS INHASLI 
SALSRDVRNLKPSAVAPQRKMPLDDTKL1 3 HQTLSVLEDIVEK1 
SGESTKSR01 CYQSLQESVQVSLALFPAF 3HQSDVTDEMLSFFL 

rpv W0f»1 D\ JfSV.f^\JD'C ,p PWPkT Tl"VPTM MMPT'DPrYJ IVPCTT VJ Pf* CTrfD 

J iir KbbnvyriGVrr 1 Ayu r i«NFir l Kr. vi>"C.-> xiuifcvjo 1 
WEKFLX1 IjOVWQEPGQVFKPFLPSI I ALCMEQVY P3 1 AERPS 
PDVXAELFELLFRTLHHNWRYFFKSTVLASVQRGIAEEOMENEP 
OPSAI MOAFGOSFLOPDI HLFKONLFYLETLNT KQKLVHKKI FR 
TAMLFOFN/NVLLOVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
AFLPE FLTS CDGVDANQKS VLGRN FKMDRVR RERGRAKR RABWA 
RKPGTCAARRGHIEASGRGLCPPCSKAAAHEMPADLVL 


5557 


1712 


491 


VI LGAGLRDKDMW I PWGLPRRLRbSALAGAGRFC I LGSEAATR 
KHbPARl^HCGLSDSSPQLWPEPPFRNPPRKASKASLDFKRYVTD 
RRliASTLAQ 1 Y LGKPS R P PHLLLECN PGPG I LTQALLEAG AKW 
ALESDKTFlPHLESLGKl^lXJKIjRVIHCtJr ^KLDPHb^GVI^U , ^ , 
AMSSRGLFKNLG1EAVPWTADIPLKWGMFPSRGEKRALWKLAY 
DLYSCTS I YKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQOKLY 
LIQMI PROKbFTKNLTPMNYNI FFHLLKtfCFGRRSATVIDHLRS 
LTPLDARDlbMQlGKQED2KWNMHPQDFKTLFETlERSKDCAY 
KWLYDETLEDR 


5558 


150$ 


96 


RAGCl'H PQV PADLGAPAE PRRPQKTCVCLLQPQPGGORG PTTM1 
TGVFSMRLWTPVGVLTSLAYCLHQRRVAtAELQEADGQCPVDRS 
LLKLKMVOWFRHGARSPLKPLPLEEQVEV7NPQLI.EVPP0T0FD 
YTVTNLAGGPKPYSPYDSQyHETTLKGGMFAGQLTKVGMQOMFA 
L^jERLRKNYVEDXPFLSFTrTJPQLVI* IRS J In i rKN Lib is 1 Kv-JjLA 
GLFOCOKEGPI 1 1HTDEAD3EVLYPNYQSCVCSLRQRTRGRRQTA 
SLQPG I SE DLKKVKDRMG I PSSDKVPFFILL-DNVAAEOAHNLPS 
CPMLKRFARMI EQRAVDTSLY1 LPKEDRESLQMAVGPFLH I LES 
NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDKKWPP 
FAVDLTXELYCHLESKEWFVQLYYHGKBOVPRGCPDGLCPLDMF 
LNAMS VYTLS PEKYHALCSOTOVMEVGNEF 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
ELD\^PIX?SVPVGLRORNQTEKQ£TGVYNKEAMLNFCEKETKK 
LMQREKSMDESKOVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEPKRGGLKKSFSRDRDEAGGKSGEKPKEEKIIRG1DKGRVRAA 
VDKKEAGfCDGRG EERA VATKKEEE KKG SDR NTGLS RDKDKKR EE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVXKNEPLKEKEAKDDSKTKTPEKQTPSG 
PTKPQFfiPAKVFPS'AAP*? 7 PnFPI.FRVKNNDPEMTSVNVNNSDC 
I TKE I LVR FTEALEFNTWXLFALANTRACDKVAFAI AIMLKAN 
KTITSLNLDSNH I TGKG IliAIFRALLQNNTLTELRFHNORHI CG 
GKTEMElAKLLKEim'LLKI^YHFEJ^GPRMTVTNLLSRNMDKQ 
ROKRLOEQROAOEAKGEKKDLLE VPKAGAVAKGS PKPSFQ PS PK 
PSPKNSPKKGGAPAAPPPPPPPLAPPMMENLKNSLSPATQRKM 
GDKVLPAOEKNSRDQLLAAIRSSNI.KQLKKVEVPKIjIiQ 


5560






q c:\nri? p<j ii i c?wcMZir , i cDcnr^wp/v^nrcRi.VTiFnPT.^&FFr'VaM 
QQRIGEIVABMDVPI^CRTEFSTOEEEQIjRAOGSTDYFLSSGDK 

irfffekgvfdekgnflvppeksinkighalhahdpvfksiths 
fkvqtliar s1x3lqmpwvqsmyifkqphfgg evs phodas fl yt 
epixsrv^gwiavedatlengclwfipgshtsgvsrrkvrapvg 
s afgts flgs epardnslfvptpvqrgalvlihgewhkskqnl 
sdrsrqaytfkbmeasgttwspenwlqp'taelpfpolyt 


5561


217E 


1775 


CYF1FQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
0LIA?TYFSAPGVKNFGN?SYPYAPGALPPPPP?HLYPNTOAPS 
OVYGGVTYYNPAOQQVQPKPSPPRRTPOPVTZKPPPPEWSRGS 


5562 


34: 


1385


SSGKND.WAGAAGLVRGLKAGVLSOADYLNLVOCEriLEDLKLH 
LOS ?D YGN FLANEAS PLTVS VI DDRJjKE KMWE FRHKRNHAYEP 
LAS FLD F I TY S Y MI DNV I hhl TGTLHORS I AELV PKCHPLGSFE 
CKEA VNI AQTPAELYNAI LVDTPIAAFFODCIS EQDLDEMNI EI 
IRNTLYKAYLESFYKFCTLLGGTTAPAMCPI LEFSAERRAFI IT 
IKS FGTELS KEDRAKLPPHCGRLYP EG LAQLARADDYEQVKNVA 
DYYPEYKbLFEGAGSNPGDKTLEDRFFEHEVKLNKlAFLWOFHF 
GVFYAFVKLKEOSCRNIVWIAECIAQRHRAKIDNYIPIF 


5563 


34: 


1385


SSGK/vWiAAGAAGLVRGLKAGVLSOADyi/NLVOCETLEDLKLH 
LOSTDYGNFLAI^EASPLTVSVIDDRLKEKKVVEFRHMRNIIAYEP 
LASFLDFITYSYMIDNVILLITGTLH0RS1ABLVPKC11PLGSFE 
OKEAVM I AQTPAELYNAl LVDTPLAAF FODC3 SEQDLDEMN I EI 
IRKTLYKAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT 
2 NS FGTELS KEDRAKLFPHCGRLYPEGLAOLARADDYEQVKN VA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFIiNQFHF 
GV FYAFVKLKEQECRNI WJ I AEC1 AQRHRAK1 DN Y I PI F 


5564 




914 


RVRRDKRAVVITARGRRRCGDSMSGGWMAQVGAWRTGALGLAliLL 
LLGLGLGLFAAASPLSTPTSAQAAGPSSGSCPPTXFOCRTSGLC 
VpLTWRCDRDLDCCDGSDEEECRIEPCTQKGOCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSR1ACLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEILPEGDATTMGPPVTLESVTSLRKATTM 
GPPVT1»ESVPSVGNATSSSAGDQSGSPTAYGV1AAAAVLSASI>V 
TATlil>l?LS VJLRAOE RLR P LGLLVAMKE S LLLS EQKTS LP 


5565 




136 


R WNS PNP ARAGS I S RPQRAPGS VSAVAMTAAVFFGCAFI AFGPA 
LALYV FT I ATEPLRI IFLIAGAFFWLVSLU SSLVWF-MARVI ID 
NKDGPT0KYLL3 FGAFVSVYI QEMFRFAYYKLLKKASEGLKS IN 
PGETA PS KRLLAY VSGLGFG I MSGVFS F VNTLS DSLGPGTVG I H 
GDSPQFFLYSAFMTLVIILLHVFWGIVFFDGCEKKKWGILLIVL 
LTHLLVSAQTF3 SSYYG1NIASAFI 1LVLMGTVJAF1AAGGSCRS 
LKLCLLCQDKNFLL YNQRS R 


5566 


204 :- 


3232 


SHICKHGRGAQAPVKMVSWMISRAWLVFGMliYPAYYSYKAVKT 
KNV K E YVRWKMYW I VFALYTVI 2TVADQT WWF PLYYSLKIAFV 
IVJLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGyET 
MVNFGROGLNLAftTAAVTAAVKSQGAITERLRSFSMHDLTTIQG 
r>EPVGORPYOPI*PEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 
EGPYEDNEMLTKKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 
RPQVYF 


5567 


1554 


233 


HFI^SGVSPDIJ^EDGLTALHOCCIDBFREMVOOLLEAGAKINA 
CDSECWTPLHAAATCGHLHLVELLIASGAJ^LAVNTDGNMPYDL 
CDDE0TLDCLETAMABRG1T0DS I EAARAVPELRMLDDIRSRLQ 
AGADLiHAPLDHGATLLHVAAANGFSEAAALIjLEHRASLSAKDQD 
GWEPLPJiAAYWGOVPLVElAVAJIGADLNAKSLMDETPLDVCGDE 
SVRAKLbELKHKHDAXjGRAOSRORSLLRRRTSSAGSRGKWRRV 
SLTORTDLYRKQHAQEAIVWQQPPPTSPEPPEDNEiDRQTGAELR 
PPPPEEDNPEWRPKNGRVGGSPVRHLYSKRLDRSVSYQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKI/ORPPPEGPESPETAEP 
GLPGDTVTPQPDCGFRAGGDPPLLKbTAPAVEAPVERRPCCLLM 


5568 


1733 


587 


AEDK QPAS RR GAGT TAAMAAS GPG CRS WCJjCPEVPSATFFTALIi 
SLLVSGPRLFLLOQPLAPSGLTLKSEALRNWOVYRIiVTYIFVYE 
NPISLLCGAI. I IWRFAGMFERTVGTVRHCFFTVIFAI FSAIIFL 
S FE A VSS l»SKLGEV EDARGFTPVAFAMLGVTTV R SRMRRALVFG 
MWPSVLVPWLLLGASWLIPQTSFLSNVCGLSIGLAYGLTYCYS 
IDLSERVALKLDOTFPFSIaMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTOSCHPHliSPSHPVSOTQHASGOKIaASWPSCTPGHM 
PTLP PYQPASG LCYVQNHFG PNPTS SS VY PASAGTSLG I QPPTP 
VNSPGTVySGALGTPGAAGSKESSRVPMP 


5569 


2 


835 


OTPCPLAWERGSRSEDISVPGOKPPTCSSFSGMDVGPSSLPHLG 
L.KLLLLLLLLPLRG0AK-TGCYG1FGMFGLPGAPGKDGYDGLPGP 
XGEPGIPAIPGIR6PKG0K6EPGLPGHPGKNGPMGPPGMPGVPG 
PMG I PGE PGEEGR Y KQKFQS VFTV7R0THQPP APNSLI RFNAVL 
TNPOGDVDTSTGKFTCKVPGLYYFYYR^SHTANLCVIjLYRSGVK 
WTFCGHTSKTNOVNSGGVLLRLOVGErVWLAVWDyYDMVGIOG 
SDSVFSGFLLFPD 


5570 


264 


946 


rdrrdrggvatsteefarprapcsrgpgpvsqtgrgrergggdt 
mss ps pg kkrmdtdwkl i es kh evtt lgglhe fwkfygpqgt 
pyeggvwkvrvdlpdkypfksps j gfmnki fhpni deasgtvcl 
dvinqtwtalydlt^ifesflpollaypnrpidplngdaaamylh 
rpebyk0kikey3qkyateealkeqeegtgdsssessmsdfsed 

EAQDMEL 


5571 


264 


946


RDR R DRGG VATS TEE PAR PRA PCS R G PG P VS QTGRGRERGGGDT 
MS S P S PG KR RMDTDW KLI E S KB E VTI LGG LN E F W KFYGPQGT 
PYEGGW1KVRVDLPDKYPFKSPS 1 GFMNKI FHPN1DEASGTVCL 
PV2N0TWTALYDLTNI FESFLPOLLA Y PNPI DPLNGDAAAMYLH 
RPEEY KQKI KEYIQK YATEEALK E EEGTGDS SS ES SMSDFS ED 
EAQDMEL 


5572 


2802 


2 0S5 


RTDYRTGlPGRRFRVMAAGDGDVKLGTLiGSGSESSNDGGSESPG 
DAGAAAEGGG WAAAALALLTGGG E MLbNVALVAl,VLU3AYRLWV 
RWGRRGLGAGAGAGEESPATSLPKMKKRDFSLEOLRQYDGSRNP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGLATFCLD 
KDALRDEYDEL-SDLNAVOWESVREKEMOFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHMKOD 


5573 


2562 


219 


VPARTPNAEDC^PEARAATATPCCSGGRERAGEAAEDGVKMAAF 
SEMGVMPEI AQAVEEMDWLLPTDIOAES I PL I LGGGDVXJ4AAET 
GSGKTGAFS I PV1QIVYETLKD0CEGKKGKTT1KTGASVLNKWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGIjMKGKHYYE 
VSCHDQGLCRVGWSTMQASIiDLGTDKFGFGFGGTGKKSHNKQFD 

NYGEEFTMHDT1GCY LD1 dkgkvkfs kngkdlglafei pphmkn 
oalfpacvlknaelkfnfgeeefkfppkdgfvalskapdgyivk 
sohsgnaovtotkflpnapkaljvepsrelaeotljjnikofkky 
i dnpklrelli 1ggvaardqlsv/lengvdi wgtpgrlddlvst 
gklnls0vrflvldeadglls0gy2 of i nrmhnqj pqvtsdgkr 
lqvivcsatlhsfdvkklseki^fptwvdlkgedsvpdtvhhv 
wpvnpktdrlwerlgkshirtddvhakdntr pgans pemwsea 
i kilkgeyavrai kehkmdoai i fcrtk i dcdnleqyfiqoggg 
pdkkghqf s cv clhgdr xpherkcnl>er f kkgdvr f1»i ctdvaa 
kgid1hgvpyv1nvtlpdekqnyvhr1 grvgraermglai slva 
tekeioa^yhvcssrgkgcyntrlkedggctikykemollseiee 
hlncti sqvepdi kvpvdefdgkvtygqkraagggs ykghvdi l 
aptvoelaalekeaqtsflhlgye-fnolrrtf 


5574 


1731 


952 


NEGLEVFKE0ELQP3DKGAVPEDA5TERSAMASLGLQLVGYILG 
LU5LLGTLVAK1.LPSWKTSSYVGAS1VTAVGFSKGLKMECATHS 
TGI TQCD I YS 1XLGLPADI QAAQ AKMVTSS A I S SLACI IS WGM 
RCTVFCOESRAXDRVAVAGGVFFILGGLLGFIPVAWNLHGILRD 
FYSPLVPDSMKFEIGEALYU3IISSLFSLIAGIILCFSCSCQRN 
RSNYYDAYOAQPLATRSSPRPGOPPKVKSEFNSYSLTGYV 


5575 


45G 


766 


LLWAUPC P PPTAAAVLLS S TGLM ELLE XMLALTLAKADS P RTAL 
LCSAWLLTASFSAQOHKGSLOKDPLLSOACVGCLEALLDYLDAR 
SPDIGRNSPHYLMFP 


5576 


2491


2146 


RSWGAPWFWRMRLLRRRHMPLRLAKVGCAFVLFLFLLHRDVSSR 
EEATEKPMLKSLVSRKDHVXDLMLEAMrWLRDSMPKLQIRAPEA 
QQTLFS INQSCiiPG r Yl PAELK.P FWERP PQDPIN APG AUG KAr Q a. 
SKWTPLETOEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DOKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLX 
EIILVDDASTEEKLKEKLEQYVKOLO^^VRWRQEERKGLITARIi 
LGASVAQAS\niTFLDAHCECFHGWLEPLLARIAEDKTVVVSPDI 
VTI DLNTFEFAKP VORGRVHSRGNFDWS LTFGWETLPPHEKQRR 
KDETYPXKSPTFAGGLFS1 SKSYFEK1 GTYDNQMEI WGGENVEM 
SFRVWQCGGQLF.: I PCS WGHVr RTKSPHTFPKGTSVIARNQVR 
LAEVWMDSYKJG V V R RNLQAAK14AQEXS FGDI S ERLQLREQLH C 
HNFSWyLHNVYPl'KFVPDLTPTFYGAIKNLGTNOCLDVGEN^JRG 
GKPLIMYSCHGU-GKOy?EYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCH FTGKNSC V P XDE EWELAQDQLI R1JSGSGTCLTSQDK KP A 
MAPCNPSDPHQ1V:LFV 


5577 


3 


1275 


RNSDCSCGEISVHCLPWV/bFlLDLKVESSMFCPLKLILLPVLLD 
YSLSLNDLNVSF l ELTVHVGDSALMGCVFQSTEDKCI FKIDWTL 
SPGEHAKDEYVLYYYSNLSVPIGRFQNRVHLMGDILCNDGSLLL 

DDI/DFi DfXZTYl C T p T.KttPttn WT.J-n/LPPF P K *?i ,mvHV 
y*>U " VJ-^> Wvv -i ■* J v >nliAbCAVVr t\ /Vrl V V Jjfl v LrLLr V fl V 

GGLIQMGCVFOS^ EVKHVTKVEW IFSGRRAKEEIVFR YYHKLRM 
SVEYSQSWGHFCKF: VNLVGDI FRNDGS IMLQGVRESDGGNYTCS 
ZHLGNLVFKKTJV^HVSPEEPRTLVTPAALRPLVLGGNOLVllV 
GIVCATILLLPVL ■ LIVKKTCGNKSSVNSTVLVKNTKKTNPEIK 
EKPCHFERCEGErG-IIYSPUVREVIEEEEPSEKSEATYMTMKPV 
W PSLRSDRNNS I F V K SGCICSMP K^OO AP 

>iroiji\ot/ xviy ivo . * ixoovoi ir ix* W^** 


5578 


3 


783 


AVESMAS PGACRA.* PELPERNCG YREVEYWDORYOGAADSAFYD 
VJFGDF£SFRAL»LHPEljR?EDRIIiVLGCGNSALSYELFLGGFPNV 
TSVDYSSVVVAAMOARYAHVPOLRWETMDVRKLDFPSASFDWL 
EKGTLDALLAG EK I'PWTVSSEGVHTVEQVLSE VSRVLV PGGRFI 
5MTSAAPHFRTRKYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LS VAQLALGAQ IIS PPRPPTS PCFLQDSDHEDFLSAI QL 


5579


3 


1540 


RNSGLARGASALJ^/-.HGGGLiAGGVGWDCGACASRCOGWEGLLTR 
CRALPALATCSRC - SGYVPCR FHH CAPRRGRRLLLSRV FQPCNL 
REDRVLSLODKSDH LTCKSQRiMLOVGLl YPASPGO'KLX,PYTV 
RAMEKLVRVI D0E?<0AI GGQKVNMPS LS PAELWQATNRVJDLMG K 
ELLRLRX3RHGKEVCLGPTHEEAITALIASQKKLSYKQLPFLLY0 
VTKKrKDfcrPRP^rvrijJjKLiKtr YMKUWi lrUi»St'tAAyyj Y SLVC 
DAYCSLFKKLGLPrVKVQADVGTIGGTVSHEFQLPVDlGEDPJLA 
1 CPRCS FSANME7) PLSQMNCPACQGPLTKTKGI EVGHTF YLGT 
KYSSI FN AQFTNV CG KPTLAEP'IGC YGLGVTRI LAAA3 EVLSTED 
CVRWPS LLAPYO*- CL 2 PPKKGS KEQAASEL 3 GQLYDH 1 TE AVPQ 
LHGEVLLDDRTKL? 2 GNRLKDANKFGYPFV1 IAGKRALEDPAHF 
EVWCQNTGEVAFLTKDGVMDLLTPVQTV 


5580 




45C 


A^AGTRCIPGFWPSGAGYSAPAORGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGEl^r/JlVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELF^GHSKTREFLAHSAKVKSVAWSCCGRRLASG 

ASGDKT I R I WDVRTTKC I ATVNTKGEN INICWS PDGQT 3 AVGNK 
DDWTFIDAKTHR? KAEEQFKFEVNEI SWNNDNNMFFLTNGNGC 
INIIiSYPELKPVOf INAHPSNCICI KFDPMGKYFATGSADALVS 
LWDVDELVCVRCF5 RLDWPVRTLSFSHDGKMIASASEDHFI DI A 
EVETGDKLWEVOCFSPTFTVAWHPKRPLLAFACDDKDGKYBSSR 
EAGTVXLFGLPI?D< 


5581 


54 


947


GGGSGPRAPSATLLCTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVY S PVQPGAPYGNP KNMAYTGYPTAY P AAAP A 
YNPSLY?TNSPSYA?EFOFLHSAYATLLMKOAWPONSSSCGTEG 
TFHLPVDTGTENETYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPWP YQTAMYF I R S A YPQQNL YAQG AYYTQP VYAAQPH VI HrJ 
TTWQPNSIPSA I Y P APVAAPRTNGVAtlGMVAGTTMAMS AG7LL 
TTPOHTAIGAHPVS MPTYRAQGTPAYSYVPPHW 


5582 


5775 


273S 


I ITNTWNVI I PLV3 A YHLSGSAQARGERS PAERLMERQKR KAD I 
EKGLQ F IQSTLPLKC EEY E AFLrLKLVQNLFAEGNDLFREKDY KQ 
ALVOYMEGLNVADY AASDOVAL PRELLCKLHVNRAACY FTMGLY 
EKALEDS E KALGL C £ ESI RALFR KARALN ELGRHKEAYECS SRC 
SLALPHDESVTQLGOELAOKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSKGYGSIDDIETDCYVDPRGSPALLPSTPTMPL? 
PHVLDLLAPLDSS ?. TLPSTDSLDDFSDGDVFGPELDTLLDSLSb 
VQGGLSGSGVPSEIPQLIPVFPGGTPLLPPVVGGSIPVSSPLFP 
ASFGLVMDPSKKLA^SVLDALDPPGPTLDPLDLLPYSETRLDAb 
DSFGSTRGSLDKFDSFM5ETNSQDHRFPSGAQKPAP5PEPCMPN
TALbJKNPI^ATHEFKQACOLCYPKTGPRAGDYTYREGLEHKCK
RD I L LGRLRS SE DQT W KR I R P R PT KTS FVGS Y YLCKDM I N KODC
KYGDNCTFAYKOEEIDVKfTEERKGTLNRDLLFDPLGGVXRGSliT
IAKLLKEHQGI FTFLCEI CFDS KPRIIS KGTKDSPS VCSNLAAK
KSFYtaNKCLVH3\^STSLKySKlR0F0EHF0FDVCRHEVRYGCL 
REDS CHFAHSF 1 ELKVWLLQOY SGMTHED1 VQ2SKKY WQQMEAH 
AGXASSSMGAPRTHGPSTFDLOMKrVCGQCWRNGCWEPDKDLK 
YCS AKARHCWT KERRVLLVMS KAKRKWVSVRPLPS 1 RHFPQQYD 
1»CI HAQNGRKCQ YVG NCS S P E ERDM WT FM K ENK 2 LDMQQT Y 
DMW LK KHN PGKPGEG T P I £ S REGEKQIQMPTDY AD 1 MMGY HCK L 
03KNSNSKKQWQQHIQSEXHKEKVFTSDSDASGWAFRFPMGEFR 
LCDRLOKGKACPDGDKCRCAHC-OEELNEWLDRREVLKOKXiAKAR 
KDMLbCPRDDDFGKYNFLLOEDGDbAGATPEAPAAAATATTGE 


5583 


3 


1265 


SSGCROGRPGRSDRPRPPPKRHKMVKETRYYDILGVKPSASPEE
IKKAYRKLALKYHPDKNPDSGEKFKLISQAYEVLSDPKKRDVYD
0GGEQA1 KEGGSGS PS FSS PMDIFDMFFGGGGRMARERRGWWV 
HQLSVTIjEDLYI'IGVTKKLALQKNVICSKCEGVGGKKGSVEKCPL 
CXGR GMH1 HI QQIG PGM V0OI QTVCI ECKGCG ER IN PKDR CES C 
SGAKVIREKXI I EVWEKGKKDGQKIL.FHGEGDQEPELEPGDVI 
I Vl>DQKDHS V FO R RGKD L> I Kj KMKIQLSEALCG F KKT 1 KTbDN R I 
LVITSKAGEVIKKGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKHWLSLEKLFOLEAJjLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHR EAYEEDEDGPCAGVQCQTA 


5584 


3 


1265 


SSGCROGRPGRSDRPRPPPRRHKMVKETRYYDI bGVKPSASPEE
IKKAYRKLALKYHPDKNPDEGEKFXLISQAYEVLSDPKXRDVYD 
QGG EOA I KEGGS G S P S FS S PMD 1 FDMFFGGGGRMARERRG KNW 
HOT. VT7 ,FDL YNG VTKKI. A 7 .HKNVI CEKCFGVGG KKG <> VFfCCPIi 
CKGRGKHIHI00IGPGMVQ0IQTVC1ECKGQGERINPKDRCESC 
SGAKVI REKKI 1 E VHVEKGM KDGOK I L FHGEGDQE PELE PGDVI 
I VLDQKDHSV FORRGHDL I M KMKI Q1>S E ALCG F KKT IKTLDNRI 
LV1TSKAGEVI KKGDLRCVRDEGMP I YKAPLEKGI LIIQFLVI F 
PEKHWLSLEKLFOLEAIiLPPRQKVRITDDMDQVELKEFCPNEON 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEA'IjDOCMTALDLFLTNOFSEALSYLKPRTKESM 
YHS LTYATI LEMOAMMTFDPODILLAGNMMKEAQ'MbCQRHRR KS 
SVTDSrSSLW^PTLGQFTEEEIHAEVCYAKCLLQRAALTFLOD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSS0YCKG3MHPHFEGG 
VKLGVGAFNLTLSMLFTR I LR LLEF VGFSGNKDYGLLQLEEGAS 
GHSFRSVLCVMLLLCYKTFLTFVLGTGNVN1EEAEXLLKPYLNR 
YPKGAI FbFLAGR I E VI KGN I DAAI RRFEECCEAQQHWKQFHHM 
CYWEUWCFTY KGQWKMS Y FY ADLLSKENCWSKATY I YMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFA1RKS 
RRY FSSN P I St>P VP ALEMMY I WNGY AV IG KQ P KLTDGI LE I ITX 
AEEtfLE KGPENEY S VDDECLVKLLKGLCLKYLGR VQE AEEN FR S 
ISANEKKIKYDKYL I PNALLELALLLMEQDRNEEAI KLhESAKQ 
NYKNYSMESRTHFR I QAATLQAKSSLBNS SRSMVSSVSL 


5586


2619 


915 


LPAGTPESSLHEALDOCMTAXDbFLTNQFSEALSY r LKPRTKESM 
YHSLTYATILEM0Ai1MTFT)PQDILLAGNr4MKEA0MLC0RHRRKS 
SVTDSFSSLVNRPTLGOFTEEEIHAEVCYAKCLLQRAALTFL0D 
EKMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTL.SMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFR5Vl>CVMLLLCYHTFLTFVl/3TGNVI3IEFAEKbLKPYL^ 
YPKGAI FLFLAGR 1 EV1 KGN1DAAI RRFEECCEAQQHWKQFHHM 
CY W ELMVJCFT YKGQVJKMS Y F YADLtSKENCWS XATY I YMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGbKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISbPVPALEMMYIWNGYAVIGKQPKLTDGlLEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLEUVLLLMBQDRNEEAIKLLESAKQ 
NYXNYEMESRTHFR10AATL0AKSSLENSSRSMVSSVSL 


5587 


1768 


148 


SSAVPDGAVGRPVAVAVGGPPKSCRCRPCCLKAAIGVKLGCTSA 
CVAVY K DGKAG W ANDAGDRVTPAWAY SEKEE I VGLAAKQS HI 
RNISNTVMKVKOlLGR^SSDPQAQKyiAESKCLVIEKNGKLRYE 
I DTGEETKFVK PEDVARL1 FSKMKETAHSVLGSDANDWITVPF 
DFGEKQKNA1GEAARAAG FNVLRLI HEPSAALIAYGIGQDS PFG 
KSN I LV FKLGGTS LSLS VMEVNSGI YRVLSTNTDDNIGG AHFTE 

ANCFLDSLYEGODFDCNVSRARFEljljCSPLFNKCIEAIRGbLDO 
NGFTAE>DINKV\ f LCGGSSRIPKLQQLIKDLFPAVELLNSIPPDE 
VIPIGAAI EAG I L I GKEKLI*VEDS1»MI ECS ARD I L VKGVDESG A 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
fptxf ao v\n .nm^nKKrw^ t.pdt t avltmkr dg<? lhvtctdoet 
GKCEAISIEIAS 


5588 


3 


585 


TPPPPEOAMVAATVAAAWLbLWAAACAQQEODFYDFKAVNIRGK 
LVSLE K Y RGS VS LWNV AS E CG FTDQHYRALQQ L.Q RDLG PHHFN 
VLAFPCl^OFGOOEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKY LACTI SGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQITaLVRKLJLLKREDL 


5589 


1884 


553 


LRQAWHEGG1G0TDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRS P FSSFLPRP I CLSLEARPCS 1 EDRRNWSL.IGRPGAPAS 
GbNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 
ATLGAAGQPLGGES I CS ARAPAKYS I TFTGKWSQTAFPKQYPLF 
R PPAQW.S S buGAAHSSDl SMWRKNQ r Vi wGlaKUi- AfcKCafcAWAijM 
KEIEAAGEALQSVRAVFSAPAVPSGTGQTSAELEVQRRKSLVS? 
WR I VPS PDW FVG VDS bDLCDGDRWREQAALiDLY P YDAGTDS G? 
TFSSPNFAT1PQDTVTEITSSSPSHPANSFYYPRLKALPP1ARV 
TLLRLRQS PRAF1 PPAPVLPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLCGGHCGKLGTKSRTRYVRVQPANNGSPCPELEEEAECVp 
DNCV 


5590 


72 


896 


JuLSSjo hLR b L r AM V AW K b Ar Jj V LLiAr 0 JLAlbvyKijol^JJrUJJr W,L» 

EDAVKETS S VKOP WDHTTTTTTNRPGTTRAPAK P PGSGLDLADA 
IjDDQDDGRRKPGIGGRERWNHVTTTTKRPVTTRAPANTLGKDFD 
UADALDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRY G SNDDFG SGMV AEPGT I AG VASAL&MAL 1 GAVSS Y I SY 
00KKFCFS I Q0GLNADYWGENLSAWCEEPQVKYSTLHTOSAE 
PPPPPEPARI 


5591 


68 


1494


AGS SRRAAAERLLVJSAGCRS LAGRASGVLLLPAELLPGEEEAMA 
LRVTRNSKIKAEKKAKINMAGAKRVPTAPAATSKPGLRPRTALG 
D3GNKVSE0LDAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEFVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
r , aoziPc , ri7 pas Trent/ tt. a iTKrnvnaPTY^inPKTr R WJTDT VA Y7i 
RQLEEEOAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
MY MTVS 2 1DR FMCNNCVP KKMLQLVGVTAMF I AS KY EEMYPP E I 
GDFAFVTDNTYTKKOIROMEMKILRALNFGl^RPLPLHFbRRAS 
KlGEVDVEOHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
IJ^NGEk'TPrLOHYLSYTEESLLPVMQHLAKNAAMVNQGLTKHM 
TVKNKYATS KHAKI STbPQLNS ALVQDLAKAVAKV 


5592 


242 


924 


YGESKDWNOKDLLSALVLTTVNCLPTPIMAKSAEVIOiAIFGRAG 
VGKSALWRrLrKRFIWEYDPTLESTYKHQATIDDEWSMEILD 
TAGQEDT I OREGHMRWGEGFVLVYDITDRGS r EEVLPLKN I LDE 
I KKPKNVTLI LVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGK I ?EI FYELCREVRRRRMVQGKTRRRSSTTHVKOAIKK 
MLTKISE 


5593 


3 


1113 


HASGGRAAN^lAAERGAGOOOSQE^^MEVDRRVESEESGDEEGKKH 
S S G I VAD L S EOSLKDGE ER SEED PEEEHELPVDMET I NLDRDAE 
DVDLNHYR I G K I EGFEVLXKVKTbCLRQNLI KCI ERLEELQSLR 
E1>DI>YDN0IKKIENLEaLTELEILDISFNLLRNIEGVDKLTRLK 
KLFLWK Kl S K I ENLSN bHQLQM LEkGSNR I RAI EN1 DTLTNLE 
SLFLGKNK I TKLQNLDALTNLT VLS MQS NR1>TKI EGLQNLVNLR 
ELYLSHNC I E V 1 EGLENKN KLTM LD I ASNR 1 KXIEN1 SHLTELQ 
EFW4NDNLLESWSDLDELKGARSLETVYLERNPL0KDPQYRRKV 
HLALPSVRQ 1 DATFVR F 


5594 


3 


1113 


HASGGRAANhlAAERGAGQOQSQEMMEVDRRVESHESGDEEGKKH 
SSGIVADLSEPSLKDGEERGEEDPEE2HELPVDMET3NLDRDAE 
DVDLNHYR 2 GKI EGFEVLKKVKTLCLRQNLl KC I ENLEELOSLR 
ELDLiYDNQ I KK I ENLEALTELE ILDIS FNLLRN I EGVDKLTRLK 
KLFIjVNNK 2 SKI -JNLSNLHQLQMLELGSNR I RA 1 2NIDTJLTNLE 
SliFLGKNKI TKLONLDALTNLTVLSMQSNRLTKI EGLONLVNLR 
ELYbSHNGl E V I EGLENNNKLTMLD I ASNRI KK I HNI SHLTELQ 
EFWMNDNL.LESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPS VRQ3 DATFVRF 


5595


3 


1476 


ARVWGRWVQV PAW PGPGCGTNASGERQRQLPRAVJR PVGRTLGSE 
PI ALAW SP PLYLF PI PLPSWAVSQPTPTLGTMFADLDYD 1 EEDK 
LG I PTV PG KVTLQ KDAONIiI GI S 3 GGGAQYCPCLYI VQVFDNTP 
AALDGTVAAGDE I TGVNGR S I XGKTKVEVAKM 1QEVXGEVT I HY 
NKL0ADPKQGMSLDIV1.KKVKHRLVE1N1MSSGTADALGLSRA1LC 
NDGLVKRLEELERTAELYKGMTEHTKNLLRAFyELSO/rHRAFGD 
VPS VI G V RE P0PAA5 EA FVKFADAHRS IEKFG I RliLKT I KPMLT 
DLNTYLNKAI PDTRLT3 KKYLDVKFEYLSYCLKVKEMDDEEYSC 
IAIjGEPLYRVSTGNYEYRIjILRCROEARARFSOMRKDVLEKMSL 

ldokhvodivf0l0rlvstmskyyndcyavlrdadvfp3evdla 
httlayglnoeeftdgeeeeeeedtaagepsrdtrgaagpldkg 

GSWCDS 


5596


698 


215 


GAVLAPSSLPAAELAAQGES OS LEDLSNTSRPTSE VYK1 S FI FP 
NGDKYDGDCTRTSSGIYERNGIGIHTTPNGIVYTGSWKDDKJWG 
FGRLEHFSGAVYEGOFKDNMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYrHICGTRMDWTFHFTSCSQT 


5597 


3 


733 


I SCKMAADGOS SLPASKRS VTLTHVEYPAGDLSGHLLA YLSLSP 
VFVIVGFVTLI IFKRELHTISFLGGLALNEGVNWLI KNV1QEPR 
PCGGPHTAVGTKYGMPSSHSQFtWFFSVYSFLFLYLRMHQTNNA 
RFLDLLW RHVIiSLGLLAVAFLVS YSRVYLLYHTWSQVLYGG I AG 
GLMAIAWF1FTQEVLTPLFPRIAAKPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKIiQ 


5598


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL, 
VPLLGSGVPFKPP APS PCCSGQTMLKMLS FKLLbLAVALGFFEG 
DAKFGERWEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECGKLLEE 
IKCALCSPKSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLOTTADEFCFYYARKPGGLCFPDFPRKQVRGPASNYLD 
OMEEYDKVEEI SR KHKHNCFC3 QEWS GLRQPVGALH SGDG SQR 
LFILEKEGYVKlLTPEGEIFKEPYliDIHJCbVOSGIKGGDERGLL 
SLAraPNYKKNGKLYVSYTTNQERVJAIGPHDHILRVVEYTVSRK 
NPHQVDLRTAR VFLEVAELHR KHLGGQLLFGPDGFLY I ILGDGM 
ITLDDMEEMDGLSDFrGSVLRLDVDTDMClJVPYSIPRSNPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARIIX?I I KGKDYESEPSLLEFKPFSNGPLVGGFVYRGCOSERL 
yGSYVFGDRNGNFLTLQQSPVTKOWQEKPLCLGTSGSCRGYFSG 
HILGFGEDEU3E VYI LSSS KSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAOTIiTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5599 


326 


2440 


GIGPI AAS FIFCKVASLYI FLS PPPPS VSGVPYSPANS SWSCAL 

VPLLGSGV P PHP PAPS PCCSGQTMLKMJjS FKLLLLAVALGFFEG 

DAKFGERWEGSGARRRRCLNGN P PKRLKRRDRRKMSQLELLSGG 

EMLCGGFY PRLSCCLRSDSPGLGRLENKI FSVTNNTECGKLLEE 

IKCALCSPHSCSLFHSPEREVLERDbVLPLLCKDYCKEFFYTCR 

rut rv'- , T?i rvTTiinccfCWMs irr>rpi ftJOrkWoo vmnjftoiVQ'NVT n 
uniycirljyi 1 AJJr.H-r i lAKWJbljLis^rrUr rKlWVJKVjrJ*c>ril 1*1/ 

QMEEYDKVEEISRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 

LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGD2RGLL 

SLAFHPNY KKNGKLYVS YTTNQERWAIGPHDHILRWEYTVSRK 

NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYIILGDGM 

ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFKS 

TNQP PEVFAHGLHDPGR CAVDRH PTDINIWLT I LCS DSUGKNRS 
WO 01/53312



PCT/US00/34263



SAR I LQI 2 KGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCW3TSGSCRGYFSG 
K 1 LGFGEDE LGE VY 1 LSS S KSMTQTHNG KLYKI VDPKR PLMPEE 
CRATVQPA0TLTSECSRLCRM3YCTPTGKCCCSPGWEGDFCRTG 


5600 


2577 


1244 


SLRVLSGKLHQTRDLVOPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCF2GMKPVNQTAASNKGLRGLLHPQ0LHLLSR0LEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHFNCKYDAKCTXPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKW1RPQTSE 


5601 


2977 


1244 


SLR VLSGHLMQ7RDLVQPDKPAS PKFI VTLDG VPS PPGYMS DQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLKLLSROLEDPNGSF 
SKAEMSELSVAQKPEKLLERCKYWPACKiaGDECAYHHPISPCKA 
FFNCKFAEKCLFVHPNCKYDAKCTKPDCPFTKVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKKECPFYHPKKCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5602 




766 


YIJTSCTVWRTAKEALENTEVPVGCLMVyNNEWGKGRNEVNQTK 
NATRHAEMVA I DQVLDWCRQSGKS PSE V FEHTVLYVTVEPCI MC 
AAALRLMKIFLWYGCQNERFGGCGSVL»NIASADLPNTGRPFQC 
I PGYRAEEAVE^5LKTFy KQENPNAPKS K*vn?KKECQ0I LNMF 


5603 


1 


565 


FRGR TP I SGGERGCAQY P 3 PATPARSGENRTM PGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVS VGKATP I YAVNGTEI LbPCTFSS 
CFGFEDLHFRWTYNSSDAFKI LIEGTVKNEKSDP KVTLKDDDR I 
TLVGSTKEKRNN1SIVLRDLEFSDTGKYTCHVKNPKENNL0HHA 
TIFLQWDRRMQ 


5604 


1 


1S06 


EDI F PAQLLKLQRHERVWQQE P PVRDHR S WGGSG AGGVAGREWT 
D0GOVALGGHYMAEGEGYFAMSEDELACSPY1PLGGDFGGGDFG 
GGDFGGGDFGGGDFGGGGSFGGHCXDYCESPTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
DVRU4GSAASHVLHQDSGLGYKDLDLI F'CADLRGEGEFQTVKDV 
VLDCLliDFLPEGVKKEKITPLTIiKEAY-VCKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFOIKLDSLLLFYECSE 
NPMTETFHPTIIGESVyGDFQEAFDHLCrJKIIATRNPEEIRGGG 
LLKYCNLLVRGFRPASDEI KTi,QRYMCSRFFI DFSDI GEQQRKL 
ESYLQNHFVGLEDRKYEYLMTLHGWNESTVCLMGHERROTLNL 
ITMLAIRVLADQNVIPNVANVTCYyQPAPYVADANFSNYYIAQV 
QPVFTCQQQTYSTWIiPCN 


5605


25 


1621 


S0RS CPRSPS S PAPPW ARCS^ PDSRTGGVPVPRAWSAGGPALGL. 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRRYPLPLRSGKEAKILQHFGEGLCRMLDERLORHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSKPVPAQPKAGGSGSYWP 
ARHSGARVILLVLYREHLNPKGHHFLTKEELLORCAOKSPRVAP 
GSARPWPALRSLLHRNLVLRTHQPARYSLTPEGLELAQKLAESE 
GLSLLNVG I G PKEP PGEETAVPGAASAELASEAGVQOQPIiELRP 
G E YRV LLCVD I GETRGGGHRPELLREL<)RLHVTHTVRKLHVGDF 
VVTVAQETNPRDPANPGELVLDHIVERKRLDDLCSSIIDGRFREQ 
K FRL KR CGLERR VYLVEEHGS VHNLS LP ES TLLOA VTNTQVI DG 
FFVKRTAD1 KESAAYLALLTRGLQRLYCGHTLR SRPKGTPGNPE 
SGAIJJT3PNPLCSLLTFSDFNAGAIKNKA0SVREVFARQLM0VRG 
VSGEKAAALVDRYSTPASLIAAYDACATFXEQETLLSTIKCGRL 
QRNLGPALSRTLSQLYCSYGPLT 


5606 




1099 


GR 5 R C ?G PG ARGGTMS PRS CLRSLRLLV F AVF S AAASNWLYLAK 
LSSVGSISEEETCEKLKGLIOROVQMCKRKLEVMPSVRRGAQLA 
I EECQYOFRNRRWNCSTLDSLPVFGKWTQGTR EAAFVYA I SS A 
GVAFAVTRACSSGELEKCGCDRTVHGVSPQGFOWSGCSDNIAYG 
VAFSOSFVDVRERSKGASSSRALMNLHNNEAGRKAILTHMRVEC 
KCHGVSGSCEVKTCWRAVPPFRQVGHALKEKFDGATEVEPRRVG 
SSRAJjVPRNAQFKPHTDEDLVYLEPSPDFCEQDMRSGVLGTRGR 
7CNKTSKAlDGCELLCCGRGFHTAQVEL?iERCSCKFHWCCFVKC 
RCGQRLVELHTCR 



WO 01/53312 



PCT/US00/34263 



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5607 


52: 


141 


PPVCNPAEAI-IPSPGTVCSLL^UWT.WLDIAMAGSSFLSPEHORV 
QQR KBSKK PPAKLQPRALAGWLRFEDGC-OAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSOALGKFLODILKEEAKEAFADK 


5608 




983 


WF0SPLR0ADPGPPRHTLFMDFVAGA1GGVCGDAVGYPLPTVKV 
RIOTEPKYTGIWHCVRDTYHRERVWGFYRGLLijPVCTVSLVSSE 
VFGTYRHCLAHICRLRFGNPDAKPTKAD1TLSGCASGLVRVFLT 
SPTEVAXVRLQTQTQAQKQQRRLSASGPLAVPPMCPVPPACPEP 
KY R G P LHCLATVAR E EGLCGLY KGS S ALVLR DGHS FATYFLS YA 
VLCEWLSPAGHSRPDVPGVLVAGGCAGVLAWAVATPMDVIKSRL 
OADGOGQRRYRGLLHCMVTIVREEGPRVLFKGbVLNCCRAFPVN 
MW FVAYE AVLRLARGliLT 


5609 


1626 


304 


AKG V WVLP £ PP PRPGRGALVSGSGLR RGKS GTS WRFRRMNHKS K 
KRZREAKRSARPELKDSLDWTRHNYYESFSLSPAAVADNVERAD 
ALObSVEEFVERYERPYKPWLLNAOEGWSAOEKWTLERLXRKy 
RNQKFKCGEDNDGYS VKMKMKYY2 E YMESTRDDSPLYI FDSSYG 
EHPKRRKLLEDYKVFKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGI HI DPLGTSAWNALVQGHKRWCLFPTSTPREbl KVTRDEGG 
KOCJDEAITWFNVI YFRTOLPTWPPEFKPLE 1 LQKPGETVFVPGG 
WWJiWLNLDTTIAITQNFASSTNFPV^v/Vv'HKTVRGRPKIiSRKWYR 
IliKQEHPELAVuADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSODDCVSKERSSS 
R 


5610 


54 


1196 


LER TPAS ADMA WTK YQL FLAGLM LVTG S I KTLS A X WADNFMAEG 
CGGSKEHSFOHPFLOAVGMFUSEFSCLAAFYLbRCRAAGQSDSS 
WPQQPFI^PLLFLPPALCDMTGTSLMYVAIjNWTSASSFQMLRGA 
VI 3 F?GLFSVAFLGRRLVLSQWLGI LATIAGLVWGLADLLSKH 
DSOH KLSEVI TGDLLI 2 MAQI IVAI QMVLEEKFVYKHNVHPLRA 
VGTEGLFGFV3 LSbLLVPMYYI PAGSF SGNFRGTLEDALDAFCQ 
VGOOPLI AVALLGNI SS 1 AFFNFAG1 SVTKELSATTRMVLDSLR 
TW 1 WALSLALG WEAFHALQI LGFL I LL I GTAL YNGLHRPLLGR 
LSKGRPLAEESEQERLLGGTRTPINDAS 


5611 


2 


57-? 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG 
ELSNRF0GGKAFGLLKARQERRLAE3NREFLCD0KYSDEEWLPE 
KLTAFKEKYME FDLNNEGEI DLMSLKRMKEKLGVPKTHLEMKKM 
I SEVTGGVSDT I S YRDFVNMMLGKRSAVLKLVMMFEGKANESS P 
KPVGPPP5RDIASLP 


5612 


1 


721 


ASRDGYMDATI APHRI PPEMPQYGEENHI FELMQAMWIiCKHLNS 
SLLTLENLILN-EFSYTATEARRLYLQRKTVPSALLVQLIQERLA 
EEDC3 KQGW1LDGI PETREQALRIQTLGITPRHV JVLSAPDTVL 
1 ERKLGKRIDPQTGEI YHTTFDWPPESE1 QNRLMV PEDI SELET 
AOKLLEYHRNIVRVIPSYPKIbKVJSADCPCVDVFYOALTYVQS 
NHRTNAPFTPRVLLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEXMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNL FFFLCLNLS FAFVE LL YG I WSNCLGLI S DS FHMFFDST 
A I LAG LAAS VI S K W R D NDAFS YG YVRAE V LAG FVNG L FL I P TAF 
FI FSEG^PJ^APPDVHHERLLLVSILGFVVNLIGI FVFKKGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHC-HFHSHDGPSLKETTGPSRQIL0GVFLH1LADTLGSIGVI 
AS AI KMQNFGLM1 ADP ICS I L IAI LI WS V I PLLRES VG I LMQR 
TPPLLBNSLPOCYQRVQQLQGVYSLQE0KFWTLCSDVYVGTLKL 
I VAPDADARWILSQTHNI FTQAG VRQLYVQ IDFAAM 


5614 


3 


1268 


LLSRigEHACPLQAGLGLTCRKPKAIRGREGRATNQGQGETONER 
APWGARQRLGVMAELQQLQEFEIPTGREALRGNHSALLRVADYC 

rnM\A7AIlTriVDW71T I?TrTMaC , T'7V^JVT.2iQ\;ZiVri\7r*KIT JVOWTT DMT T» 
tjUyi I v\jf\l JJiVKi^/iljCiti I rlHc 1 iyALfti \JnI y VWNJjHvxTI L LiKrllxU 

LQGAALRQVEARVSTLGQMVNMHMEKVARREIGTLATVQRLPPG 
QKV1 AFENLPPLTPYCRRPLNFGCLDD3 GHGI KDLSTQLSRTGT 
LSRKS 1 KAPATPASATLGRP PRI PEPVKLPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPFPLDGDELGLPPPPPGF 
GPDE PSWVPAS YLE KWTLYPYTSQKDN E LS FS EGT V I CVTRRY 
SDGWCEGVSSEGTGFFPGNYVEPSC 


5615 


9 


1558 


ALGRRR PGDPREMEAAATPAAAGAAKREELDMDVMRPL I NEONF 
DGTS DE E H EQELL P VQKH YQLDDQ EGI S F VQTLMKL LKGNIGTG 
LLGLPLA I KWAG3VLGPISLVF I G IISVHCMHI LVRCSHFLCLR 
FKKSTLG YSDTVS FAMSVS PWSCLQKQAAWGRS WDFFLVITQL 
GFCSVY I VFLAENVKQVHEGFLESKVFI SNSTNSSNPCERRSVD 
UUYMLCFLPFI I LLVFIRELIO^FVLSFLANVSMAVSLVI IYQ 
YWRNKPDPHNLP J VAGWKKYPLFFGTAVFAFEGIG WLPLENQ 
MKESKRFP0Al»NIG^5GIVTrLYVTLATU;yMCFHDEIKGSITLK 
LPQDVWLYQS VKILYS r GI FVT Y S I QFYVPAE 1 1 1 PGITS KFHT 
KWK0ICEFGIRSFLVS1TCAGA1LIPRLDIVISFVGAVSSSTLA 
L I L? PLVE ILTFSKEHYNI WMVLKN I S I A FTG WG PLLGTY ITV 
EE 1 1 YPTPKWAGTPQS PFLNLNSTCLTSGLX 


5616 


1 


715 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
LKHIiSSGDLliRDNMLRGTEIGVLAKAFIDOGKLIPDDVMTRLAL 
HELKNLTOYSWLLDGFPRTLPQAEALDRAYQIDTVINLN^/PFEV 
2KQRLTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLI0REDD 
KPETVI KRLKAYEDOTKPVLEYYOKKGVLETFSGTETNKI WPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVSSVG 
TCEAAGKSPEPXDYDSTCVFCRIAGRQDPGTELLHCENEDIiICF 
KD3 XPAATHHYIjWPKKHI gncrtlrxdqvelvenmvtvgxtil 
ERNNFTDFTNVRMGFKMPP FCS 1 SHbHLHVLAPVDQLGFLSKLV 
YRVNSYWF1TADHLIEXLRT 


5618 


3 


1692 


YI>NYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKS IRLLSEI EKLVGTSVPGLLE2 ILSSS 1LEI YN 
KILQTWPDEDVTFRKSCATKRKLSNINQEEASGTSLHOKAIMT 
FTCHNEINAFWLSRGSOILSLKSTRFLTKLGHCSSACPSDSVS 
gTNIQNbKGLNSPVLlGKSKDPSCVAKVSEEGKPAIGTOKMELH 
VRWRSDTGKCVDA5 PLWI PTFDKSSTTVY1 GSHSHRMKAVDFY 
SGKVKWEOIIiGDRIESSACVSKCGNFIWGCYNGLVYVLKSNSG 
EKYWMFTTEDAVKSSATMDPTTGLI YIGSHDQHAYALDI YRKKC 
VWKSKCGGTVFSSPCLNLIPHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSSPQCCSQYICIGCVIX3NLLCFTHFGEQVW0FSTS 
GPI FSSPCTSPSEQKI FFGSHDCF 1 YCCNMKGHLQWKFETTSRV 
YATPFAFHNYNGS}?EMLLAAASTDGKVWILESOSGQLQSVYEIiP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQX 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSU;SPPGAGRGCPCP 
AQSLHSHQLAAVJDPLXPSLRSYPPHLLOHPOtRSLTASSGHLGR 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGS PSSPGCCLHPPAOH SQDLPLVHVDVGWOPPLGP 
TVGLRPGLLGERQRGALRAGDPOCOCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620 


930 


102 


PLPF PTLAMFLTRSEYDRGVNTFSFEGRLFOVEYAI EAI KLGST 
A I GIQTSEGVCIiAVEKR ITS PLMEPSS I EKI VEI DAHIGCAMSG 
LIADAKTLIDXARVETQNHWFTYNETMTVESVTOAVSNLALQFG 
E EDADPGAMSR PFG VALLFGG VDEKG PQLFHMDPS GTFVQCDAR 
A I GSAS EG AQSS LOEVYHKSMTLKEA 3 KS S L»I I LKQVMEEKLNA 
TNI ELAT VQPGQNFHMFTKEE LEEVI KDI 


5621 


3 


819 


WEFVE YTATDAN VKNESLS SVQQLG I XMTVR YG KFLS LLKDGA 
ENDLTWVbKHCERFLKQQOTSIKSSLLCLQGNYAGHDWFVSSLF 
M I MLGDKE KTFQFLHOFSRLLTS A FL WLPR LHISS YLPWDTVES 
GIHPVYFCSTHY1EMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIEICHYIATCVFLGPDYQVYICIAVFKHLQODILQ 
HTQTQDLQVFLKEEALHGFRVSDYFEYMEILEONYRTVLLRDMR 
NIRbQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISXRLNHNP<21RNPMKAMY 
PGTF YFOF KNLWE ANDKMETWLCFTVEG I KR R S WS W KTG VFRN 
QVDSETHCHAERCFUSW FCDDI hS PNTKYQVTWYTS WSPCPDCA 
GEVAEFLARHSNVNLTIFTARLYYrOYPCYOEGLRSLSOEGVAV 
EIPmYEDFKYCWENFraDNEPFKPWXGLKTNFRLLKRRLRESL 
0 


5623 


3 




FLPFFIRAPKIS RNGQ WLFTFT TP FP FANKALPG WEG IVPACFW 
RKK I LTPSTGTMELLQVTI LFLLPS I CSSNS TGVLEAANNSLW 
TTTKPSITTPNTESLOKNWTPTTGTTPKGTITNSLLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDS2 1 SNVTVTSVTLPNAVSTLQS 
S K ? KTSTQS S I KTTE I PG S VLOPD AS P S KTGTLTSI PVT I PENT 
SQSQV I GTEGGKKASTSATSRSYSS 1 1 LPW1 ALI VITLSVFVL 
VGLY RMC WKADPGTPEN GNDQPOS DKE SVKLLTV KTISHESGEH 
SAOGKTKN 


5624 


159 


896 


PGVAAAAGALPO YHG PAPALVS CK P. ELSLSAGSLQLERKRRDFT 
SSGSRKLYFDTHALVCLLEDNGFATOQAEIIVSALVKILEANMD 
IVY KDMVTKMQOE I TFQQ VMSQI ANVKXDM 1 1 LEKS EF5ALRAE 
NEK1KLELHQLKQQVWDEVIKVRTDTKLDFKLEKSRVKELYSLN 
EKKLliELRTEIVALHAQQDRALTOTDRKIETEVAGLKTMLESHK 
LDN3 KYLAGS1 FTCLTVALGFYRLW I 


5625 


1 


1180 


TI PS S AAACRAG PPAGALE ALSPGGARAHAERRGEMRATPLAAP 

L^CT.QPVtfPT.Pl.'nnMT nTPOPVnKEXvP^ftPrvC'PT.PPr'T.T.PT.CiPP 

TAPDRATAVATASRLGPYVLL5PEEGGRAYQALHCPTGTEYTCR 
VYP VQEALAVLE P YARLP PHKHVAR PTE VLAGTQLLYAFFTRTH 
GDMKSLVRSRHR 1 PEPEAAVJbFRQKATALAHCHQHGLVLRDIjXLi 
CRFVFADRERKKLVLENLEDSCVLTGPDDSJLWDKHACPAYVGPE 
ILSSRASYSGKAADVWSLGVALPJ'MLAGHYPFODSEPVLLFGKI 
RRG A Y AL PAG LS A PARC LVRCLLR R E P A ERLTATG I LLHP WLRQ 
DPMPLAPTRSHLWEAAOWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


PPRALGSVAMENQVI.TPHVYVIAORHREUYLRVELSDVONPAISI 
1 t»N Vijnr rJ^\^nK3j\r^siJr* V it»t nutLr iiisLtv r*Jrt.Fv 1 R.1j1\ < /\v v W 
I TVQK KVS QWWE R LTK.QE KR PLFLAF D FDRWLDESDAEME LRAK 
EEERLN KLRLES EG S P ETLTNLRKG Y LFM YNLVQFLG FS W I FVN 
LTVR FC I LGKES FYDTFKT VADMM Y FCQMLAWET I NAAIGVTT 
SPVLPSLIQLLGRNFILF1I FGTMEEMQNKAWFFVFYLWSAIE 
I FRYS FYMLTCI DMDWKVLTWLRY TLW I PLYPLGCLAEAVS VI Q 
SIPIFKETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFbGLYINF 
RHLYKQRRRRYGQKKKKJH 


5627 


3123 


2011 


PPRALGSVAMEKOVLTPHVYWAORHREbYLRVELSDVONPAI S I 

TPW\n ,UP»f AOf^Hrta. V , rinMVVFPMT,FPT,nT.VK'PFP\7YKT.TnPnVN 
i ZiV* v un r t<n\^^ri\3nT\\3iJv* v i c r nuur xjuuv i\.tr e#jt v i ivu a. \jt\\j v n 

ITVOKKVSQWWERLTKQEKRPLFIiAPDFDRWLDESDAEMELRAK 

EEERlNKliRLESEGS PETLTNIiR XGYLFMYNLVOFLGFS WI FVU 

LTVR FC I LG KES FY DTPHTVADMHY FCQMLAWETINAAI GVTT 

SPVLPSLIOLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 

IFRYSFYMLTCIDKDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ 

S I ? I FNETGRFS FTLP YP VK I K VR FS FFL Q I YI»I MI FLGL Y I NF 

RHLYKORRRRYGQKKKKIH 


5628 


75 


1455 


VAGAMAS KCLKAGF SSGSLKS PGGASGGS TRVSAMY S SS P CKL P 
SLS P VARS FS ACSVGLGRS SYRATSCUPALCLPAGGFATS YSGG 
GGWFGEGI LTGNEKETMOS LNDRLAGYLEKVROLEQENASLESR 
2REWCEQQVPYMCPDYQSYFRTIEELQKKTLCSKAENARLWE1 
DNAKLAADDFRTKYETEVSLRQLVESDIKGLRRILDDIjTLCKSP 
LEAOVE S LKEE LLCLKKNH EE EVNS !-R CQLGDRLNVE VDAAPF V 
DLNR^EEMRCQYEJTLVENTORDAEDWLDTQSEELNOQVVS S SE 
QLQSC0AEIIELRRTVNALEIELOA0HSKRDALESTLAETEARY 
SSQLAQKOCMI TNVEAQLAEI RADLERQNQE YQVLLDVRARLEC 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCSARPICVPCPGGRF 


5629 


2287 


938 


GRPRSSSDNRKFLRERAGLSSAAV0TR1GNSAASRRSPAARPPV 
PAP PALPRGRPGTEGSTS LSAPAVLVVAVAVVWWSAVAWAMA 
NYIHVPPGSPEVP KLNVTVQDQEE HR CREGALS LLQHLR PH WDP 
QEVTLOLFTDG I TNKIilGCY VGNTMED WLVRI YGNKTELLVDR 
DEEVKSFRVLQAHGCAPOLYCTFNNGLCYEFIQGEALDPKHVCN 
PAIFRLI ARQLAX I HAI HAHNGWI PKSWLWLKMGK YFSLI PTGF 
ADEDINKkFLSDIPSS0IU3EEMTWMi<XILSOTiGSPWLCrTOL 
LCKNI I YKEKQGDVC F I D YEYSG YNYLAYDI GNHFNE FAG VSDV 
DYSLYPDRELQSQWLRAYLEAY KEFXGFGTBVTEKEVE 1 LFIQV 
NQ FALASKF FWGLWALI Q AKYST I EFDFLG YAI VR FNOY F KMKP 
EVTALKVPE 


5630 


1194 


278 


GFWAI AQTCAHHLPPG S PWLVPAS PWRLPEMSS FGYRTLTVALF 
TLICCPGSDEKVFEVHVRPKKLAVEPXGSLEVNCSTTCNOPEVG 
GLETSLDK1LLDEQA0WKHYLVSNISHDTVLQCHFTCSGK0ESM 
KSKVS VY QPPRQVI LT LQ PTLVAVGKS FT I ECRVPTVE PLDSLT 
LFLFRGNETLHYETFGKAAPAPQEATATFNSTADREDGKRNFSC 
LAVLDLMSRGGNI FKKKSAPKMLE I YE P VSDSQMVX JVTWSVL 
LSLFVTSVLLCFIFGCKLROQRMGTYGVRAAWRRLPOAFRP 


5631 


1052 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARKAATMSVFGKLFGAGGGK 
AG KGGPTPQEA IQR LP DTEEMLSKKQE FLEKKI EQELTAAKKHG 
TKNVCRAALOALKRKKK YEKOIAQI DGTL.ST1 EFOREALENANTN 
TEVLKNMGYMKAMKAAHDNMDIDKVDELMQDIADQQELAEEIS 
TA1 SKPVGFGEEFDEDELMAELEELEQEELDKNLLEI SGPETVP 
LPWPSIALPSKPAKKKEEEDDDKKELENWAGSM 


5632 


3 


952 


WLGWS P PR RLWWGS LrGAAQR PAVPVSGLARS LHVETR R PH RRA 
SVRVARGRbGVWAOPOPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FT FVS SADAEDLSGS 1 ASPDVKLNLGGDFI KESTATTFLRORGY 
GWLLEVEDDDPEDNKPLLEELDICLKDIYYK1RCVLMPMPSLGF 
NRQWRDNPDFWGPLAWLFFSMISLYGQFRWSWI ITI W3 FGS 
LTIFLLARVLGGEVAYGQVLGVIGYSLLPLIVIAPVLLWGSFE 
WSTL I KLFGVFWAAY S AASLLVGEEF KTKKPLL 2 Y P I FLLY 1 Y 
FLSLYTGV 


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEOLLSSHLHQVPFFCCFTWCLCN 
CLFENSVSXLYMLCFK FFMSI FFYSLS I TKLML1 YLWGLSYOSL 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRPxAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLS S R FRRVD I DE FDENK F VDEQE EAAAAAAEPGP DP S E VD 
GLLROGDMLRAFHAALRNS P VNTKNQAVKERAQG WLKVLTN FK 
SSEI EQAVOSLDRNG VDliLMK Y I YKGFEXPTENSSAVLL»OWHEK 
ALAVGGLGS I 1 RVLTAR KTV 


5635 


3 


943 


DRG PR S TATDTGRAR VS FWRFPLDPGVKNSNVQI SGEKRRFR7L 
RS LFHPFPVTRSGAPR>i VLVGS S W PAXJM VAPAVKVARG W SG LAL 
GVRRAVLQIiPGLTQVRWSRYSPEFKDPLIDKEYYRKPVSELTEE 
EKYVRELKKTQLIKAAPAGKTSS VFEDPVI SKFTNMMM3 GGNKV 
LARSIMIQTLEAVKRKQFEKYHAASAEEQATIERNPYTI FHOAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 
KKHQRTLMPEKbSHKiLEAFHNOGPVIKRKHDLHKMAEANRAJLA 
HYRWW 


5636 


2253 


1143 


LEDTI CQH P p AEKKLYL YH R KLRE V ERNGI PRLPKD VFMDTHQG 
LTDVRAKVTGFSEGWDS VKGGFSS FSQATHSAAGAWS KPREI 
ASL1RNKFGSADNIPNLKDSLEEG0VDDAGKAI»GVISNF0S£PK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTIJMQSSGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEBOLNDLTELHOKEILNLKQEIASMEEKIAYQSYERARDI 
QEALEACQTK I S KMELO QQQCQ WQLEGLENATAKNLLG K L I N I 
LlAVMA\O.LVFVSTVANCWPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFS YVERFFSS PR 


5637 


946 


2532 


MSFCG ARANAKMMAA YTv GGTS AAAAGHHHHHHHHLPHLP P PHLK 
HHHH PQHHLH PGS AAAVK PVQCHTS S AAAAAAAAAAAAAMLNPG 
QQOPYFPSPAPGQAPGPAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQOLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKKPNVFONL 
VSCXRVFRELKMLCFFKH DNVLSALDILQPPHI DYFEE I YWTE 
LMQS DLHK 1 1 VS PQ P L»S S DHVKVFLYQ I LRGLKYLHSAG I LH RD 
I KPGNLLVNSNCVLKI CDFGLARVEELDESRHMTQEWTOYYRA 
PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIOOLDL 
ITDLLGTPSZjEAMRTACEGAKAHILRGPHKQPSLPVLYTLSSQA 
THEA VHLLCR MLVFDPY XRI SAKDALAH PY L DEG RLR YHTCKCK 
CCFSTSTGRVYTSDFEPVTNPKFDDTPEKKLSSVROVKEIIHOF 
ILE00KGNRVPLCINPQSAAFKSFISSTVAOPSEMPFSPLVWE 


5638 


125 


1155 


DRKMSELDOLRCEAEOLKMQIRDARKACADATLSOITNKIDPVG 
R I QMRTRRTLHGHLAKI YAMHWGTDS3LLVS ASQDGKL 1 1 WDSY 
TTNKVHAI PLRSSi-TVMTCAYAPSGNYVACGGLDN ICS I YNLKTR 
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQ?TTFTGHTGr)^SLSLAPDTRLFVSGACDASAKLWDVREGK 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL 
KTYSHDNIICGIT£VSFSKSGRI»LLAGYDDFNCi4VWDAl>KADRA 
GVLAGHDNRVSCbGVTDDGMAVATGSWDSFLKjWN 


5639 


125 


1155 


DRKMSELDQLROEAEOLKNQIRDARKACADATLSOI'nvTNIDPVG 
R I QMRTRRTLRG KLAK I YAMHWGTDSRLLVS A S C DGKL I I WDSY 
TTNKVKAI PLRSSKVMTCAYAPSGNY VACGGLDK1 CSIYNLKTR 
EGNVRVSRELAGKTGYLSCCRFLDDNQIVTSSGDTTCALWDIET 
GQQTTTFTGHTGDVMSLSLAPDTRLFV3GACDASAKLWDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRIjFDLRADOEL 
MTYSHDNJ ICGI TSVSFSKSGRLLLAG YDDFUCf^DALKADKA 
GVlAGHDNRVSCbGVTDDGMAVATGSWDSFLK^W 


5640 


280 


1092 


ck)gn kktmlsh ntmmkqrkqqata 3 mx2 vhgn dv d g mdlgk kvs 
iprdimleelshlsktrgarlfkkrqrrsdkytfenfoyosraqi 

NHS I AMQNGKVDGSNTjEGGSQQAPLTPPNTPDPRS PPNPDNIAP 
GYSGPLKEIPPEK?NTTAVPKYYOSPWBQAlSKD?SLLEAliYPK 
IjFKPEGXAELPDYRSFNRVATPFGGFEKASRMVKFKVFDFELLL 
LTDPRFMSFVNPLSGRRSFNRTPKGV7ISEN1PiYITTEPTDDTT 

vpesedl 


5641 


27 


332 


crhncngdvkllsmomdklfafhlftfkgllkfldgsiokliqa . 

eiilsdnssilvlenwlfkvkskqfihliakkfyisitivsas 

ngesfvlsmivtg 


5642 


195 


1247 


itpcrmdflvlflfylasvlmglvlicvcskthslkglarggao 
i fs ci : peclqramhgllhylfhtrnhtfivlklvlogmvyte Y 

TWEVFGYCOELELSbHYLLLPYLLljGVNLFFFTbTCGTNPGlIT 
KAKELLFLHVYEFDEVMFPKNVRCSTCDIjRKPAKS}GJCSVCNWC 

vhrpdhhcvwvnnci gawn iryfli yvltltas aat VA I VS TTF 
lvhlwmsdlyoevyiddlghlhvkdtvfliqylfltfprivfm 
lgfvv\tlsflu?gyllfvlyi^tnottnewyrgdwawcqrcpl 
vawppsaepqvhrn i hshglrsnlqei flpafpch er kkce 


5643 


3 


847 


psggvr dvetrg pg s raarg prwkkrrgvgag A i ak k klaeak 

YKERGTVliAEDQIAOMSKQLDMFKTKLEEFASKHKOEIRKNPEF 
RVQFQDMCATIGVDPLASGKGFWSEMI.GVGDFYyEi.GVOI I EVC 
bALKHRNGGLITLEELHQQVLKGRGKFAQDVSCDDLIRAIKKLK 
ALGTGFGIIPVGGTYLIQSVPAEU^MDHTVVLC1AEK>3GYVTVS 
EIKASLKWETER7iR0VLEHLLKEGLAWLDLQAPGEAKYWLPALF 
TDLYSQE ITAEEAREALP 


5644 


83 


1138 


PRRMGS W VQL1 TSVGVQQNHPGWTVAGQFQEKXK FTEEVJ EYFQ 
XKVSPVHLKILLTSDEAWKRFVRVAELPRESADALYEALKNLTP 
YVAIEDKDMQOKEQ0FREWFLKEFPC1 RWKIQES 3 ERLRVIANE 
I EKVKRGCVI ANWSGSTG1 LS VIGVMLAPFTAGLSLS I TAAGV 
GLGIASATAGI ASS 1 VENTYTR SAELTAS RLTAT S TDQLEALRD 
I LHDIT PNVLS FALD FDEATKM I ANDVHTLRRS KATVGR PLIAW 
RYVPINWETLRTRGAPTR1VRKVARNLGKATSGVLWLDWNL 
VQDSLDLHKGEKSES AELLRQWAQELEENLNELTH 2 KOSLKAG 


5645 


53"? 


799 


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL 
YLCLQKSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLSSFIK>jFFEEL>EYILGF 
LSLLKFHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHL 
RAEGGAQ 


5647 


286 


800 


GV I MATSELS CEVS EENCERREA FWAEWKD1/TLS7R PEEGCSLH 
EEDTQRHETYKQQGQCOVLVQRS PWLMMRMG I LGRGLQE YQLP Y 
0RVLPLP1FTPAKWGATKEEREDTPI0LQELLALETALGGQCVD 
RQEVAEITKQLPPVVPVSKPGALRRSLSRSMSQEAQRG 


5648 


1 


1518 


VLSELCGRHEALREVGAEWPPPTCSPKICSGLCQAGNTOWSLTM 
AP0SLPSSR!4APLGMLLGLLMAACFTFCLSH0NliKEFALTNPEK 
SSTXETERKETKAEEELDAEVLEVFHPTHEW0ALQPGOAVPAGS 
KVRLNLOTGEREAKLQYEDKFRNNLKGKRLDIOTNTYTSODLKS 
ALAK FKEGAEMES SKEDKARQAE VKRLFRP I E ELKKDFDE LNW 
lETDMQIMVRLINKFNSSSSSLEEKIAALFDLEyYVKQMDNAQD 
LL»S FGG LQ W I NG LNS TEPLVKEY AAFVbGAAFSSN P K VQVEAI 
BGGALQKLLVrLATEQPLTAKKKVLFALCSbLRHFPYAGRQFLK 
LGGLOVLRTLVOEKGTEVLAVRWTLIiYDLVTEKNFAEfTEAELT 
0EMSPEKL0QYR0VHLLPGLWEC<5WCE1TVHLLALPEHDAREKV 
LOTLGVLLTTCRDRYRQDPQLGRTLASl-OAEYQVLASLELODGE 
DEGYFOELLGSV^SLLKELR 


5649 


in? 


3006 


KLQEQLDAINEE 1 RMI QEEKESTELRAEEl ETRVTSGSMEALNL 
K0LRKRGS I PTSLTDbSLASAS PPLSGRSTP KLTSRSAAODLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPAuSOEEGKSALEDOGSNPSSSNSSUDSLHKGAKR 
KG I KSS IGRLFGKKJEKGRLIQLSRDGATGHVLLTDSEFSKOEPM 
VFAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 

Lhrt VorJrAW I VAAL KAN V KouAi M£>Atf£>lJ 1 r,l\JHhl\9l i: iVAJjriK 
LKLRliA 1 QEMVS LTSPSAPPTSRTS SGNVWVTKEEMETLETSTK 
TDSEEGSWA0TLAYGDMNHEWIGNEWLPSLGLPOYRSYFMECLV 
DARMLDHLTKK0LRVHLKMVDSFHRTSLOYG1MCLKRLKYDRKE 
LEKRREESQHE1KDVLVWTNDQWHWQSIGLRDYAGNLKESGV 
HS ALLA1>D ENFDHNTLALI L»Q I PTQNT0ARQ VMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHKLSAFRI 


5650 


1172 


3006 


MLOEOLDAINEEIRMIQEEKESTELRABEIETRVTSGSMEAXjNL 
KQLRKRGS I PTSLTDLSLASASPPLSGRSTPKLTSRSAAODLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEGKSALEDOGSNPSSSNSSODSLHKGAKR 
KGIKSSIGRLFGKKEKGRilQLSRDGATGHVLLTDSEFSMOEPM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAOWDGPTWSWL 

Ciij/T VVjrlir/iW X VAAV-.KAJN VJ\OUaX Mo/iljoiJ 1 C X >'•*<£• ^wAOi\>UUnri 

LKLRLAI QEMVSLTS PSAPPTS RTSSGNVWVTHEEMETLE TS TK 
TDSEEGSWAOTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV 
DARMLDHLTKKBLRVHLKMVpSFHRTSLQYGlMCLKRLNYDRKE 
LEXRREESOHEIKDVLWTNDQVNmWVQSIGLRDYAGNLHESGV 
HGALLALDENFDHNTUU.lLQIPTOOTQARQVMEREFNNLlxALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREKHGRGGMLSASAETLPA 
G FRVS TLGTLQP P PAP PKKIM PEAKS H YLYGHMLS AFRD 


5651 


646 


1869 


AR0GORQPWG * EARAKG PASES PRV* EGSGWEGPASP *TPGSTL 
AWGEGAGIR* ASGLTAAGAASAAAA/PPPTRGGPAPAGCGRAPP 
WPAPLiRVPTHGRAPAPRSRAAPRAPAliSHGTAAAALSPASPAGP 
ADP * LPGKS SQSPPRG * RWGRSRSAPAPAHPEH PAPAGSAS ASQ 
QTPGWPGSCCLAQGWQAEPIX3APGAEDG\PVPPCRGFPbGTLGS 
PAGS WAGLAG YG * AGAPGTQATAPRAAGQT P VAAAPNCR V * GSA 
PALHRAPAAADPGSPLOAPPRAWASPAAAGPGLSSSDYCGGLGA 
GWRAGISPEL)X5AAGLSDNWARCPGPGPAE*GG0PGCRTIPASA 
CMPSP?VEGSLGLSRKGHGPLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHOKSFSCPEPACGKSFNFKKHLKEHKKIiHSDTRDYI 
CEFCARSFRTSSNLVIHRRIHTGEKPLQCEICX5FTCR0KASLNW 
HORKHAETVAALRFPCEFCGKRFEKPDSVAAHRSXSHPAILLA 


5653 


66 


1401 


RG RL0S RGR LTLGLVLLL LD I LGARQHGQRVS HG WKGG FLTAPL 
CFPQPCQPGTRRGRRRSLKEATEPQLAMAEEFVTLKDVGKOFTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPKLTSHPDGSED 
LEPLAGGSPEATSPDVTETKNSPLMEDFFEEGFSQEI/SRDVIQ 
GW1LELQFRRSLYRGHLVR*FARRSRKSSEV*YCHQRGKSHGMQ 



ES*ZKHRTQSCVHRFHGRRFHG\DNVSEKTLTPAKSKEYRGEFP 
SYSDHSQQI)SVQ£GEXPYQCSECGXSFSGSYRLTQHVa?HTREK 
FTVHQECEOG FDRKASHSG Y PKTHTG YKFYVCNEYGT P FSQSTY 

ECGKAFTR 3 FHLTRHQKIHTRKRYECSKCQATFKLRKHIjIQHQK 
THAANV 


5654 


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGKVSJNQRRDSVRMSAL 
NWKPFVYGGLAS 1TAECGTFP1 DLTKTRFQI 0GQ7NDAKFXEI I 
yRGMLKALVRIGREEGLKALYSG'VGLKAFLCHCSLFHMGIDFR 
PRLHRSCVXSLRCV* KEQ1A* */MFSLLISTLI SKY3 YYAADVL 


5655 


2 


867 


RPPG I RAPRQLHPAAGRRPDASARPRFRPTVLLHDPFOLS FPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEEGDLADIKSSLVNES 
EI I PASNGKEVARQAQTSQEPYHDKAREHPDDGKHPDGGLYKKG 
PSYSSYSGY3MMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLITYSDEHFSPGSHPSHI PSDVNSKQGMSRKPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRRVPPLPE FASG PGAAF FH SGRLQRS LTKDS AGC FSQCRS RAM 
hVLRSGhTKAhASRTLZPQVCSSFXTGPRQVVGTrYEFRTYYLK 
PSNMN Ac MENLKKN IHLRTS YSELVGFVfSVEFGGRTNKV FHIWK 
YDKFPHRAEVRKAIANCKEWQECS 1 1 PNLARI DKQETEITYLI P 
WSKLQKP PKEG V YELAVFOMKPGG PALWGDAFERAI N AHVNLGY 
TKWGVFHTEYGELNRVKVLWWNESADSRAAVRHKSHEDPISWG 
GVRESVNYL\VSQQNM 


5657 


105 


1052 


GC-RLOSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSOTES EEES S EMDDED YER RRS ECVS EMLDLEKQ FSELKEXL 
FRERLS0LRLRLEEVGAERAPEYTEPLGGLQRSLKIR10VAGIY 
KGFCLDVI RNKYECEIjQGAKQHLESEKLLl.YDTbQGELQER IQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
P Y I VYMLOEI DI LEDWTA3 KKARAAVSPOKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALWfTPPL 


5658 


2346 


2541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLKCTDTELQLRRDA I FCQALVAAVCTFSEQLLAALGY 
RYNNNGE Y EES SRDASRXWLEQVAATGVLLHCQSLLS PATVKEE 
RTMLED I WVTLS ELDNVTFS FKQLDEN YVANTKVFYH IEG SRQA 
LXVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLEMVEGLPS 
PGSOAAEDUX)DlNAQSLEKVQQYYRKLRAPYLERSNLPTDAST 
TAVKIDQLIRPINALDELCRLMKSFVHPKPGAAGSVGAGLIPIS 
SELCYRLGACOMVMCGTGMQRSTLSVSLEQAAILARSHGLIiPKC 

1 Ply A lul PIK ft\A3 rKvtii LiA IVW L»Xw M/y PI ir y>s>Ur K U I IvLA- y r IU*U<I 

GDL 


5659 


2 


696 


WKRSGEVS PKGELGAWRGNSGRPKI IGRAAEAENEDRTLGRLLP 
GNERSQPRSPLRLIAPQLKAEAAADKGLAPVPPPFSSGHSGPCX 
ER EGE GQRGRGRSRRGAH LELXPS PGLRAGAPTDRGRGGPAEVA 
Rk££!R R MVOICRCORTT ,EFRE ^ELS <5NPAA£AGASLEP PAAPAPG 
EDNPAGAGGXAAVAGAAGGARRFLCGWEGFYGRPVrVHEORKEL 
FRRLQKWEL^TYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSLIiGFVATVTLIPAFRGHFIAARL 
rnnni NK^'Q^nnTPP^nr^VT <?fi2x\7P7.T ilfcftpfpflncfvke 
0RKAFPHHEFVALIGALIAICCMIFIX5FADD\nLNLRWRHKLLLP 
TAAS LPLLMVY FTN FGNTTI WPKPFRPI LGLHLDLGR * S YHCC 
PYGTYFREPFLVLHILLQVFLFCLCVFPDPFW 


5661 


2 


473 


LlJLYPSPCGGIPKLPGLPREAAAALGASFLAEAPIiPVTVRGSGL 
AG HAVTCD P KAFLS I CFVTLVFI.QLPLAS I CQN * GTDSCASRG K 
ADFDVTGPHAPIIAMAGGHVELQCQ1.FPNISAEDKELRWYRC0P 
SLAVHMKERGMDMDGEQKWQYRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKHKPSAKLSVRDALGAONASGERIKIOGWIRSVRSQXSVLF 
LHVNDGSSLESLOVVADSGL.DSRELTFGSSVEVQGOLIKSPSKR 
0NVELKAEXIKV3 GNCDAXDFPI KY KERHPLE YLRQYPHFRCRT 
NVI>GSI LRI RSEATAAIHSFFKDSGFVHIHTPI ITSNDSEGAGE 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQS R RHLAEFYM I E AE I S F VPS LQDLMQVI E ELFK 
ATTMMVLSKCPED\ r ELCHK?IAPGOKDRL*HMLXNNFLIISrrE 
AVEILKQASQNFTFTPEWGADLRTEHEKYLVKKCGNIPVFVINY 
FLTLKP FYMRDNEDG POELEGS VA*HSLGLMILLSIWIGQP 


5663 


US 


692 


PADIGRSTAXTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLG IALTKPDLITCLEQGKE PWWI KRHEMVAKPPVI CSItFP 
QDLWAEQDI KDSFQEAI LKKYGKYGHANFQLQXGCKSVDECKVH 
KEHDNKLNQCLIPKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG 
GPPPGWDAAPPERGCEIF1GKLPRDLFEDELIPLCEKIGKIYEM 
RMMMDFNGTWRGYAFVTFSNKVEAKNAIKOWJNYEIRNGRLLGV 
CAS VDNCRLFVGG I PKTKK 


5665 


347 


702 


WQHL 1 1 LLHCERTS P AM I TS EL P VLQDSTNETTAHSDAGS ELE 
ETEVKGXRKRGRPGRPPSTNKKPRKSPGEKSRIEAGIRGAGRGR 
ANGHPQQNGEGEPVTLFEVVKLGKSAMQRC 


5666 


213 


540 


VSCLPTSCICMITIiNNQDQPVPFNSSHPDEYKIAALVFYSCIFII 
GLFVN1TALWVPSCTTKKRTTVT2 YMM.WALVDLI FI MTLPFRM 
FYYAKDEWPFGEYFCQI LGA 


5667 


1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLFJVWPMLPKRRUARVGSP 
SGDAASSTPPSTRFPGVA1YLVEPRMGRSRRAFLTGLARSKGFR 
VLDACS S EATHWME ETS AE EAVSWQE R RMAAAP PGCTP P ALLE 
I S WLTESLGAGQPVPVECRHRLEVAG P S KGPIiS PAWMPAYACQR 
PTPIiTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVLKAL 
PSPVTTLSQLQ 


5668 


691 


894 


CS FLFCI PDLFLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTAI RCAQAQTG I DLSGCTKK 


5669 


407 


1 


DSGAPEGbSFLMSTOEGLSMHAHPOAYTPFlYLHARKRRGEIGD 
ADSRFNDR YAHKSAQLY FLY FVCW I FQDVY Y FTI KEKNHFFFP K 
ARGAPTKYSGS PIGS PTTTPPTRF PS FNLHPAPHLLASMQliQKL 
NSC 


5670 


3 


373 


SSECLTMAWIPLLLPLL1LCTVSVASYELAQPSSVSVSPGOTAK 
I TCSGD VLAKKYAR W FQQK PGQAPVLV I YKDTERPSG I PER FSG 
£ TSGTTVTLTI SGAQVEDEAD Y FCYSATDNFLWVF 


5671 


280 


524 


KFPPKKTPPHLCMESAITLWQrbLOLLLDQKHEHLICWTSNDGE 
FIOjLKAKKVAKLVIGLRKNKTNMNYDKLSRALRbbFMT 


5672 


2 


557 


F VPATPDPG VWLP P SRDPAMA K RSSIjY I RI VEGKNL PA KD J TGS 
SDPYC3 VKVDNEP1 1 RTATVWKTLCPFWGEEYQVHLPPTFHAVA 
FYVMDEDAbSRDDVIGKVCLTRDTlASHPKGKFSLPSHTGLPSP 
WPPSKSETSPLGS VWS PAQGKPFLLS PE AGAT FCT PGLCSAACS 
0AWLLLPLP 


5673 


327 


696 


ITVADQI SHWSAGR I KNRTRI PEClHSSAATTbAGPHTMEGESV 
KLSSQTLI QAGDDEKNQRT ITVN PAKMG KAFKVMNELRSKQLLC 
DVMIVAEDVEIEAHRVVLAACSPYFCAWFrGDMS 


5674 


17 


984 


GGGSMEGESTSAVLSGFVLGALAFOHLNTDSDTEGFLLGEVKGE 
AKNS ITDSQMDDVE WYTIDI OKYIPCYQLFSFYNSSGEVNEQA 
LKKILSNVKKNWGWYKFRRHSDQ1MTFRERLLHKNLQEHFSNQ 
DLVFLLLTPSIITESCSTHRLEHSbYKPQKGIiFHRVPLWANLG 
MSEQIX5YKTVSGSCMSTGFSRAVOTHSSKFFEEDGSLXEVHKIN 
EMYASLQEELKSICKKVEDSEQAVDKLVKDVNRLKREIEKRRGA 
QIQAAREKNIQKDPOENIFLCQALRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISK 


5675 


80 


753 


EGSRRGPTRLARIiSARAGRLHFPPGFSSRLIHFRGVSECRRPPG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRMREPVRKVT 
LIif^VGLDNAGKTATAKGIQGEYPSPVAPrVGFSKIWIiRQGKFEV 
TI FDLGGGIRIRG2 WKNYYAESYGVIFWDSSDEERMEETKEAM 
SEKLRHPR ISGKFI LVLANKQDKEGALG EADVIECLSLEKLVNE 
HKCL 


5676 




930 


FVS S PPPR P VQPAR PGG FGLSGRRSbLCQVASTPAHVGVMRSP V 
RDLARN DG EESTDRTPLLPGAPRAEAAP VC CS ARY NLA I LAPFG 
FFIVYALRVNliSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
KNQTGKKYQWDAETQGWI ICS FFYGY 1 1 TQI ?GGY VASKIGGKM 
LLG FGI 1>GTAVLTLFTP I AADLGVGPLI VLRALEGLGEGVTFPA 
MHAHWSSWAPPLERSKbLSlSYAGAQLGTVISLPLSGIICYYMN 

wtyvfyffgtig1pwfllwiwlvsdtpqfj1krishyekeyilss 

l ; 


5677 


j 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSbSROGERSGGKLVAAARAA 
VTAETHPLPLLAPLAVCOSVKSPAACCVRPRPRAVALPAALGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSOQSVQTLSLWLIKH 
RKHAGPIVSWHRELRKAKSNRiaTFLYLANDVIONSKRKGPEF ' 
TREFESVL\nDAFSHVAREADEGCKKPLERLLNIWOERSVYGGEF 
1 QQLKLSME DS KS P P P KATEEKKSLKRTFQQ I QEEEDD DYPGSY • 
S PQDPSAGPLLTEELI KALQDLENAASGDATVRQKI ASL.PQBVQ 
DVS LI^EK I TDKSAAERLSKTVDEACURKRGPGTS 


5678


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHbSGAGEAAPCPG 
GSGRTAAPRTRADFAAQSLMIMNKMKNFKRRFSLSVPRTETIEE 
SIiAEFTEOFNQLHNRRNENbOLGPliGRDPPQECSTFSPTDSGEE 
PGOI.SPGVOFQRR0K0RRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQ PDFDVS KRLS1>PMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYI MEP S I FNTLKRY FQ AGGS PENVIQL 
LSE1JYTAVAOTVNLLAEWL1QTGVEPVQVOETVENHLKSLLIKH 
FDPRKADSIFTETCETPAWLEQMIAHTTVvTOLFYKLAEAHPDCL 
MLNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


256 


592 


RRLTSTSEKLQNRNSHTPLESLIHPQPSYKGFGIMFGKKKKK1E 
ISGPSNFEHRVHTGFDPQEOKFTGLPOQVmSLLADTANRPKPMV 
DPS C I TPIQLAPMKTI VRGNKPC 


5681 


45 


869 


LLCAKTLGVRTKESQAEGYNRSG1NNHQAEDPRFCPSFCWMRSA 
RQTRPQRLRKEAARPFTPGSCPGGTGMDGKKCSVWKFLPI.VFTL 
FTS AGLW 1 V YF I AVEDDK I LPLNS AER KPGV KHAP Y I S 1 AGDDP 
PAS CVFSQVMNWAAFLjALWAVLR Kl QbKPKVLNPWLNI SGLVA 
bCLASFGMTLLGNFQLTNDEEIHNVGTSLTFGFGTLTCWIQAAL 
TLKVNI KNEGRRVGI PRV I LSASI TLCVGPLLHPHGPKHPHVCS 
OGPVGPGHVL 


5682 


39 


622 


PSRSCLGTMRKWRHREVNLPSVTQODAVCPAPIPSPGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGIiLLLVPLbLLPGSYGL 
PFYNGFYySNSANDONLGNGHGKDIiLNGVKLWETPEETLFTYO 
GASV3 LPCRYRYEPALVS PRR VRVKW WXLSENGAP EKDVLVAI G 
LRHRS FGDYQGRVHLROD 


5683 


89 


778 


' GSCGATALITRCLAWSVLIS RLAMATY TCITCRVAFRDADMQRA 
HYKTDi-mRYT^LRRKVASMAPVTAEGFOERVRAQRAVAEEESKGS 
ATY CT VCS KKFAS FNAY ENHbKSRRH VELEKKAVQAVNRKVEMM 
NEKKLEKGLGVDSVDKEAMNAAIQQAIKAQPSMSPKKAPPAPAK 
E ARNWAVGTGGRGTHBRD P S E KP PRLQ WFEQQ AKKLAKHSEDD 
SSDEEHDLC 


5684 


195 


677 


TWCFRG YLGPRVIMXALDEP PYLTVGTDVSAKYRGAFCEAKI KT 
AKRLVKVKVTFRHDSSTVEVQDDHIKGPLKVGAIVEVKNLDGAY 
03AV I NKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
ZjDOLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


LLLOOPVVHCFLLFPPFRFSHHMIFGPPGPHTTGIPHPAIVTPQ 
VKOEHPHTDSDLMHVKPQHEORKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKES AAINQI LGRRKHALSREEQAKYYELARKE 
RQLHMQLYPGWSARDNYVSPSS IPVALHS 


5686 


128 


1181 


CTWWQVNITLLD I NDNHPTWKDAP Y Y 1 NLVEMTPPDSDVTTWA 
VDPDI^ENGTLVYSIOPPNKFYSI^STTGKIRTTHAMXjDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 
qnlpfvaevleg 1 pagvs iyqwa i dldeglngl vs y rmp vgmp 
rmdflinsssgwvttteldreezaeyolrwasdagtptksst 
£TLtihvldvndetptffpavyk\'svsedvpr\gsgwsg*aarn 
ndvg ln aels y f i tgg nvdgkf s vg yrdawrt wg ldrett aa 
ymli lea2 dkg ? vgkrhtgtat v fvtvldvndkrpi il0ss yv 


5687 


17 


917 


AAPPAPPDG /FPP/PPPAPPT/PG PAA/APASS CQPRLS AGRAA 
OGDGGAAAVGHVLWPAVGPVRVNPGLQTPVPRPELLPGP\SSS 
LHSDSS YP PDAGLSDDEEP PDAS LF PDPPPLTVP / ADA/ PMPVT ■ 
SGCRM P STSASE / AAGGQG ACTKA KGS ETP P PAS ? QTS E PA? S P 
LPPHLTGGPGMYSSEAKLPNSFSCLGLAGTGAG1 * GTASAHGTG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGKCYRLLKTGIEHGAMPEQVGVYWYS/CLYDSRKLFF 
* SHMI IRS LL* KV I DDS LGQLPLLRELLL* * LNVI DRC2 1 LAYV 
LR VEKTFAI TYLKNFTVK VDFSLLGE I PLI SMAAI LKLWI MKID 
DGYIPAVF 


5689


1504 


3 


HELSGKHI SMYSGNTCNWH PGGHS PGGGGQGEI TSKDRGBI PAL 
I WA/RKPI GTWTATKPTHRAG* GGAEEYQPPFQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDOHPHS 
PGEHRPSG \£ PLPACPPRAW PKAGA VASATGTG \ PQLPGSRGKQ 
KL>PRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 
PQTPGGRGSLEWGLPLY LGPHHDVX* RSDRLG* PP * GGQGGGGH 
GAP£TPGPGGEAW*LPQQTSRPKFGPQAY*GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCKYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GS PPPA*ASAGRKGTVSTLGGGLL 


5690


1424 


58 


PS PFAGVCAA PAPLPLLALARRDRR PCSPGAEAAPWQTGGPAI D 
GAWRTSVSALRRGATG/APCSPGAEAAPWQTGGPAIDG\DGELP 
* VRSEEAPRGCGAEGGGPGSGPVRK PGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLORLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAAL PER TRG VAE P P AWAHAGS DA WRAGR * SQRT * ERAR P R H 
PTFOGRAGS\GOPGYQPPNPHPGPSSPPAAP\GPRGA*GNPOLE 
KAPRS DRN P SQGLRTR I R R PET PDCGP P S P AGSS AS ASTFR CTS 
SLSLLG P/ PGAHNLDTAPODR * HGP* GDKRGAPGVAGEDPRPP * 
GNFVR* LLLMP/GVA* RHGTS PFLGPSLGENGGQWDSGNLFGTP 
KG * SHPAFTKST* SMEAEKS Y WNH PUR \DRGRQGVR INCLRVGE 
S EM WG P YS A P R PGT V FbS S FL S PAS EEH \ PEGSS S FNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


550


1SNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGILSFIED 
VAHRMLATGECTPEDLCFSLQVMO* KTGTESWG*RFYIVEQN* S 
GDAPLIFS P YLSLTGNCG FAMLVE I TERAMAH\ CGSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692 


1193 


548 


TOAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPL\QSVPCIOK 
PS I FSSYP I /GLPQSGGEPGPVGE0QPVRRPEQPSCGPAS3MPL 
TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSKLQPPRKVAVPGPTR*RDQDSKQDFSSKPLCS 
VPGLASTQOTLTPADS G PGTGGR DATRAGLPG VETMGNG VD 


5693 


1258 


1330 


ALTWPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP 
*OAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRWS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 
ORCL\NNLSSEEFNASSSLNSLPST?TASRRNSTIVLRTDSEKR 
SLAESGLS W FS ES EE KAP X KLE YDSGS LKMEPGTSKWRRERPES 
CDD S S KGGELKKP I SLGHPGSLKKG KTP PVAVTS PITKTAQS AL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
I AR PS TSGS FG YKKP PPATGTATVMQTGG£AT1»S KI QKS SG I P V j 
KPVNGRKTSLDVSNSASPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSS I DPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDR EKEKAKA KAVALDSDNI SLKS IGS PESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKPPSL-ANLDKVNSNSLDLPSSS 
DTTQCI 


5695 


3 


1338


GSKEPARSLKRRGSGKKSSAGKWGSVTLSTAGALG* KQLHQ* WT 
ORCZ>\NNl/SSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELXKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 

IN. v JWj J\. It £ A>i 1 L/3\Kj lv Li/i v JVM lljJjyXooojyi4oKlJKJUol//\jT\J\^t'0\J 

IARPSTSGSFGYKKPP P ATGTATVMQTGGSATLS K I Q KS S G I P V 
KPWGRKTSLDVSKSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKSKAKAKAVALDSDNISLKS3GSPESTPKN0ASH 
PTATKIAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5696 


3 


1338 


GS KE PARS LHRRGSGHKS S AGKWGS VTLSTAGALG * KQLKQ* WT 
0RCL \ NNLS S E EFNAS S S LNS L PS TP TASRRNS TI VLRTD S EKR 
ELAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDS S KGGE LKKP ISLGK PGSLKKGKTPPVAVTS P ITKTAQSAL 

IARPSTSGSFGYXKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
K PVTJGR KTS LDVSNSAE PG FLAPGARSNIQYRSLPRPAKS SSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVAU5SDNISLKSIGSPESTPKN0ASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATEAAPPPPEPVPAA 
QGPATVQS VEDFVPDDRLDRS FLEDTTPARDE KKVG AKAAQQDS 

SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGP I AAQMLS FVMDDPD FES EGSDTQRRADDFPVRDDPSDVTDE 
DEG PAE P P PF P KLP LPA FRL KNDSDLFGLGLEE AG P KES S E EG K 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 

« «< W rC * * KoiX t K. i Art 


5698 


2 


666 


G AEAAE PQEDL PPLSOSSRF FQEQQ KMNKSIjGP V S F KDVAVD FT 
OEEWQQLDPEQKI TYRDVMLENYSNLVSVGYHI I KPDVISKLEQ 
GEEPWIVEGEFLLQSYFDEVWQTDDLIERIQEEENKPSRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
RASSEYISSDGRYARMKAJ)ECSGCGKSLIiHIKLEKTH?GDOAYE 
FNQ 


5699 


2 


1448 


RVROPPGL.WVRRTVPAMOCPAGLSRVPGVAG/DPSLPSFRGPRD 
EAAHRGTIOTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA 
R PS / GRTN APF PQGQK P AGKAAPG P AAAGRVAMR \ P GHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL*RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNT 
WTQKWTGE/ SPAPGEEG\VAPAPRGPTAEHGHCELTTESOYSNN 
VPILFONPSGALRSRRTEPAGWVPPTRHE+DDG*TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRLAAGABTSASARCPPAAAA 
G WQ PR R PG FAG RAAL PG P PK P PS S * RELGGLPG PG W * TLD P LPA 
H P AHPPG S A P PWGALGG W AAARAS LPWS PS LCLS FPAVTP VAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWEINIY*RRSNIHKNSKSESHLNQDHSFPPPTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSliIKAILRVSVLSE 


5701 


59 


410 


3 FE Kl CSDTOE FI S PE I NPQ1 CSWLI FDKGAK/NHATGKDS LFN 
KWSWKNWLSTCR*MRPG?YFTPYTKINSK* IK/DANIRCETVKL 
LEENTGENLKDTGLGNVF LDMTPKTQPTKQK 
WO 01/53312



PCT/US00/34263



5702 


2 


1517


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVIT?SRASESSASSDGLH?VITPSKASESSA 
SSDGPHPVITPSRASESSASSDGPHPV1TPSRASESSASSDGLK 
PVITPSKASESSASSDGPHPVITPSWSPGSDVTLLAEALVTVTN 
1 EVI K CSI TEI ETTTSS I PGASDTDL1 PTEGVKASSTSDPPALP 
DSTEA K PHI TEVTASAETLSTAGTTE SAAPHATVGTPLPTNSAT 
E RE VT A PG ATTLSG AL VT VSRN P LEETS A LS VETPS Y V KV SGAA 
PVS I EAGSAVGKTTSFAGSSASS YSPSEAALKNFTPSETLTMDI 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATP7TARTRPTT\A*VQVKM£VSSSCG"* VVJLPRKTSLTPEWQ 
KG + OS SSTGNSTPTRLTS RSP YCVSG EANG / PSAAARHV P YAKR 
GCCP*PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATGPS 
LTS TG VY VWG GAS P VPRG VLG LTLAH VL C F S KEKT 


5703 


14 


1117 


HHKDSRSOGLPRTOECARPELRPLbCPRALWPVTRLSYRCPWQA 
P KAG 1 G TKAKP S ES HLKLH PG WP S LDRQG E P ATLGTGTGH CSD S 
RILRWHP* HTAAR* PRWRRLPSSHRWTRHLGVLRVODKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS * WCPWL+ AARWTGWRTASGAS AGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR* H * TAGAPAS VRSS0GATRSPAPGGDQC 
ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


22 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVECDORITAEEAISHEWI 
SGNAASDKNI KDG VCAQ I E KNFARAKWKKAVRVTTLMKR LRAPE 
QSSTAAAOSASATDTATPGAAGGATAAAA.SGATSAPEGDAARAA 
KSDNVAPRRP*LPPOP0KEVPPQPLMAVSP0PPMEASLQPLMGE 
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISCAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAAS DKK I KDGVCAQ I E KNFARAKWKKAVR VTTLMKRLRAPE 
QSS TAAAOS AS ATDTAT PG AAGGATAAAAS G ATS A PEGD AARAA 
KSDMVAPRRP * LPPQPQMEVPFQPLMAVS PQPPMEASLQPLMGE 
SPQP 


5706 


3161 


610 


0LGRFXAODTVA1 RKVKEVFGTGAMRHWI LFTHKED* GGQALD 
DYVANTDNCSLKDLVRECERRYCAFNNWGSVEEQRQQOAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYOAKVE 
WQVE K>1 XQE LR EN ESKWA Y KALLRVKH LKLLH Y E I FVFLLLCSI 
LFFI I FLF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGTRHHOGRAEPEPDAPERAPLRR* 
MFAIQPGLAEGGQFLGDPPPGLCQPELOPDSKSNFMASAKDANE 
NWHGM PGR V EPILRRSSSESP SDNQAFQAPGS P E EG VR S P P EGA 
EIPGAEPEKMGGAGTVCSPL2DNGYASSSLS1DSRSSSPEPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGKPGQPRPYLDLPA 
Q AS VS R P KDRA* G EAVS LS LS SGDVCGHTDGGGAG S D PGAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLL1AIRYSD1PSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG*AGGGGP\ARTHADLPCVGFVCSPP 
LLK* S DS P VKQLP A\SGQGS GAGMPP VGS S DI LR PR P TS VSGTG 
RAAG* CSWQPAACCTPRSQ* WAVARSPSRCSRW* RQSGR+RG+ S 
S RRR RGP * AAGR S T PAV P * P CS * GGAG RRA YACRTG W G YA PSR * 
LEPSGPTSG5AL* TWASHSTGA* * SRLCGTAGTGPLCSQSSRS * 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTOPRAPSAH 
GRGRAMGSRCVCTCTGLPCPGIPLSGASPGGSGETGAGRSHTLK 
AARSRLS PRPGSGSRGSY* SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH* YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS /SGGASGS PA^SCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2C31 


I TLCPLPOTEKCLNWTEAATPLGI YLKAR VEAGGLKELEI S WG 
LHQI WR WGAWMRAGMGGCR CWGVMAPFAPK/KALS FLVNDCS 
LI HNNVCMAAVFVDRAGEWKLGGLDYM YSA9GNGGGPPRKGI PE 
LEOYDPPEIiAJ^SSGRWREKRSADMWRIjGCLIWEVFWGPLPRAA 
ALRNPGKIPKTLVFHYCELVGANPKVHPNPARFLQN T CRAPGGFM 
SNHFVETNLFLEE1QIKEFAEKQXFF0ELSKSLDAFFEDFCRHK 
VLPOuLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PWVK 
MFS S7DRAMR I RLLQQMEQF1 QYLDEPTVNTQI FPHWHGFLDT 
NPA 1 REQTVKSMLL.LAPKLNE AKLNVELMKH FARLOAKDEQG PI 
R fNTTV PT JTi K t CZ <? vt . C A C TR HR VLT*? A F ^R ATR D 5 FAP <5 R V AGV 
LGFAATHNL YSMNDCAQKI LPVLCGLTVDPE KS VRDQAFKA I RS. 
FLS KLES VS EDPTOLE E VEKDVHAAS S PGMGGAAAS WAGWAVTG 
VSS L7SKL3RSHFTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DV3 GOV ^ R A <5 n V c; \ T P TTMP PN PO ^ PTG AAG K \ ^GLLG TG LA 
GAKLPGATS * RYTAGORV 


5710 


1 


562 


I PGST I SCEV ELMARKAKT I DS FTQNQTRLWI I DGLDACECDK 
VLOMLDTVRVLFS KG P F I AI FASDPHI I IKA3 NQNLNSVPSGFK 

OILOGYRKKLTEEFHRTALGR*QNLVAROPS1DG*DAIGPELYV 
CIA10FNTNKDDAT 


5711 


1526 


1130 


RRKPFQWTTVTOEAFSKHDVAFTSTPVLFYPDSAQPFIVKSESS 
S01AKAVLSQ0RPSLFHECAFHFFS*SLQRHT1NLD0GIP*LLM 
L5EER0HLFESS/ I WTTPHNLK* / FEIHEHLGSHEGHWTLFFLL 
OIL 


5712 


3 


1391 


GRKLFOSLDISERLKFLbTbDCVDDTLIVLAEEHGCLDHKELP 

ETV1 0 J #T rN K CLTFH PSKR PTPDELM KDX VFS E VS PLY TPPTK PA 

SLFSSSLRCADLTLPEDISQLCKD1NWDYLAERSIEEVYYLWCL 

AGGDLEKELVNKEI I RSKPPICTLPNFLFEDGESFGOGRDRSS/ 

i FR* YHWDIWMPAKK*IERCWGRSILPITbKMT5bl 

N ELS AAATLPL 1 1 R EKETE YQLNR 1 1 LFDRLLKAY P Y KKNQ I WK 

E AR VD 2 P PLMRGLT W AALLG VEG A I HAK YDA I DKDT P I PTDR 0 1 

L.VU1 r Atriy X iJ^ijLioiJrtorlAAr KKV ±jJ\l>aWv vt/irl/ljV I W\J*jJ-iL» 

SLCAPFLYLNFNNEALVYACMSAF1PKYLYNFFLKDNSHVIQEY 

LTVFSOMIAFHDPEI.SNHLNEIGF1PDLYAIPWFLTMFTHVFPL 

HKIFHLW\DTLLLGEFLFPILYWE 


5713






j- 1 vow r VL/KWt' vijt J KiiL/yiioyyij <tti\.jjr / i\L'r i<-k r vAiJSJfnr/on 
T ACR C S RRG AQVQH L P RE D I RAAE * D PH LRE VK PGL PTSS ATS P 
♦ RAVLTS PCSHLGSADAASSHWLCGVSFH 


5714 


212 


613 


V?GLGLGPTMSSLGGGSQCAGGSSSSSTNGSGGSGSSG?KAGAAD 
KS AWAAAAPAS VABDTPPPERRNKS G 1 1 SE PLN KS LRRSRPLS 
H YSSFGSSGGSGGGS MMGGES ADKATAAAAAASLIJuVGHDLAAA 
MA 


5715 


131 


1979 


esascqkrskcliltlklelsgsapkktsarpgsslwlpphsqe 
0tppaskl0ggggglq1x5kglhpvpvtaasplprwclfgavak\ 
glpgp * lcpsgaa/gglqrgpglspligaagkvsclhppsmvenw 
dstchekhegilaarvtpvpXsgkpgrvlkppgrvcrpphpaas 

PRPPGS / SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 
QIRRKETKPL* RKTP AG \NNYQS N S I PVSQS PQLTVDLLP SAG R 
TOAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC+EV\GALGEPVRIPG 
L* PDLSCILSNGSKHRREGLSFPRSIjGPGRRGPAGLQSLGCSPT 
P KNTA CH S SGHVALQ AGHDS ARDVGSGHVALOAGHDSTQDVGR P 
VMR WI FLE * LGLSRETGQATRRGLVW I S PGRAAAACVACAQALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQFEPGAGAPCRE/GG*DPT 
GLT/ GVPGTDPKRGGRKPGQSGQETQGPTVWSG PES P LQPKP * E 
ROE / VGAGASSGVG LS RGRAGGPSS AWEVAAMLLLLRKGSHSE L 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEFHR1,CEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI 2 VGGYSCCMPLXT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDOLMGFERDSE 
GDSLGARPGLPYGLSDDESGGGRALSAESEVE3PARGPG2ARGE 
R PG P ACQLCGGPTGEG PCCG AGG P GGG PLL P P RLLYS CR LCTF V 

EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSP^EOFfJAVPRRPFnAIjIi'.PDIi^IjlWPPGfiA^FT.PTjf'G 
Q\ CGVKGRA S AGLDQNHCQS / SLFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
H. H M KTHSG E K P FRCAR C P YAS AH LDNLKRHQR VHTGE K P YKC PL 
CPYACGNIANLKRKGRIHSGDKPFRCSLaJYSCNOSMNLIRHM 


5718 


120


284 


VAHALSLPAESYGNDVSMTHPOLPPTOLAWDLCRTCLPbSYNFT 
S+*STADPLHL 


5719 


46 


426 


ELNNG PFQKPLCNGGNLAVTGS WADRSPLHEA ASQGR LLALRTL 
LS0GYNVNAVTLDHVTPLHEACLGDHVACAR7LLEAGAJJVNAIT 
IDGVrPLFNACSQGSPSCAELLLEYGAXAQP\ESCLPSP 


5720 


1


1051 


LOAFRNASEVPMVLVGTQDAISAA\NPRVYRRTSRARKLSTDLK 
\RCT\YYE\TCGGTYGLQMWSV5FQDVA0KWA1j\RKK0Q\1AI 
GPCK\bJjPI* \orSH \SAVSAAt I PAKAPIJNUGHE/ SGGG5AFSLJ 
Y\SSSVPSTPSISQRELRIETIAASSTPTP1RKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDSIGSGRAIP1KQGILI»KRSGKSLNK 

KRLPRATPATAPGTSPRANGLSVERSNTOLGGGTGAPHSASSAS 
LHSERPLSSSAWAGPRPEGLHCHSCSVSSADQWSEATTSLPPGM 
CKPASG 


5721 


97 


492


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA 
VFYAI AGGLFLERAYYYAFAAHHTGITDTTRVG I ILSRGTAAS1 
SFMFSYILliTMCRNTLilTFLRETFbNRYVPFDAAVPFHRLlASTA 


5722 


88 


1042 


VALDVLAGS S PGGGMAGAl.bG PRVHG I RAVLRVARGGVQAPGAP 
GSLGVSHAAA PPARPQGAAQS PHRGRRHGGGGAG1.P PPRSPRFP 

qesvpaststargprrvsrrlppqhpgprgrrrrpgagvgaprr 
grargqagllgrqgqgg rg aereraalq arrgrr pg pepdqs cg 
grprraaaapgrapadpqppaprfapapdvrppadapapapapa 
pp p p phlgaltagsgeerqsqpraetlrlgrc- aplp \ praergg 
rpkoaeqqqNpkrptppargpossgdpamlporaglrtgglagt 

KSSTREIPEMI 


5723 


88


1043 


VALDVLAGSS PGGGMAGALLG PRVKG I RAVLRVARGGVQAPGAP 
GbLCjVoHAAAPPARPQGAAySPHKORKHGGCKj P 
QESVPASTSTARGPRRVSRRLPPQKPGPRGRRRRPGAGVGAPRR 
GPJU^GQAGl^GRQGOGGRGAERERAALOARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQP PAP RP APAPD VRPPADAPAPAPAPA 
P PPP PHLGALTAGSGEERQS Q?RAETLRLGRGAPLP\ PRAERGG 
RPKC/AEQQQX PKRPTPPARGPQS SGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAPPAPLFDASASPt^PHRRAKSLDRRSTEPSVTPDLLNFK 
KGWLTKQYEK3QWKKHWFALAX)0SLRyYRDSVAEEAADLDGEID 
LSACYDVTEYPVORNYGFOIHTKEGEFTLSAMTSGIRRNWIQTI 
MKHVHPTTAPDVTSSLPEEKNKS5CSFETCPRPTEKQEAELGEP 
DPEOKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGF 
ADTH \DPWR PEAEHGELERSRARRREERR KRFGMLDATDGPGTE 
DAALRMEVDRS PGLPMSDI.KTHhTVHVE 3 EQR WHQVETTPLREEK 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSOKEASDLLEQ 

nrllcdqlrvalgreqsaregyvlqatcergfaameethqkkie 
dl0rqhqreleklreekdrllaeetaat3 sai eamknahreeme 
releksqrsq1ssvnsdvealkr0yleelqsvqrelevlse0ys 
qkclenajii*aoaleaerqalrqcqre^oelna:^oelniwlaae 
itrlrtlltgdgggeatgsplaqgkdayelevpsgarpcltqlc 

TOEPOGSAAWPLSYRWGGTDLRQQESOGPGRSKSPEGGEEQ 


5725 


3 


1049 


VNGHSEETSQS PNRTEPHDSDCS VDLG 3 S KSTEDLS PQKSGPVG 
S WKSHS 1 7NM EIGGLK I YD I LS DN \ DLS SHLQPLK/ FTS A VCG 
KN I VR SKAATLLYDQPLQVFTG S S S S SDL I SGTKAI FKFDS NHN 
PE / G AXYN KR ? HKWAHN LHLKYMVL HS 1 1 SKTVAV\ RSQRHFVA 
LOTKSPNRPCQFSSSAPS/VDQRAQ/INOSYAKHSANMNFSNHN 
NVRANTAYHLHQRLGPARHGEKWAI S PNDRLI PAVTRSTIQRQS 
SVSSTASV74LGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSARTYSIDGPNASRPQSARPSINE1PERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWV*;NSGLPASSHSSKLPVTVGFSGCVKRLRLHGRPLX3AP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGSPGGSGEGPPLSSPSQPLPADLPGATLPDVGLELEVRPLAVT 
GL1 FKLGOARTPPYLOLQVTEKO^LRADDG 


5727 


21 


221 


RP I LI LKETRRLPWATGYAEVINAGKSTHNEDOASCEVLTVKKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728


2 


877 


NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSO 
GG PAGAGGDAG / LPGRCPS APWRAGS RP AASCPDWI PGPQGLWL 
HRNPTS/GPPSQ1GEGAEQGDEGVADAP0IQCKN/GAEDPPAED 
E P PQV PEAGE EDAVPAE EG PGGT P E TQADQVR ER PEAHLAEGG A 
KGSPRRLADPQDLPAGQMSLAPFFPPVAAVIRSNK 


5729


1 


1525 


AGG AR E VLTLQLGHFAGFVGAH WWNQQDAALGRATDSKE P PGEL 
CPDVLYRTGRTLHGQETYTPRLILMDLKGSLSSLKEEGGLYRDK 
OLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 
KSlPNGKGSSPLPTATTPKPLIFTEASIRVWSDFbRVHLHPRSl 
CM I QK YNHDGEAGRLEAFGQGESVLKEPKYOEELEDRLHFYVEE 
CD Y iA? u> r v 1 LCDLiH DGFSG VGAKAAEliby UiiYSGRGIl 1 WGijIjF 
G F YK RGE AQRN I YR LLNTAFGLVHLTAH SSLVC P1»S LGGSLGLR 
PE PPVS FPYLHYDATLPFHCSAI LATAbDTVTCS \ YRLCSS PVS 
MVHL\ ADMLS FCG KKWTAGAI I PFPLAPGOSLPDSLMQFGGAT 

OK T P T . Q & nfiFP CP TOPF H Ci C \7\71 T r»T? ft PWTQOT /PPf^TPP P Q A 

LHACTTGEE I IAQYLQQQQPGVMS S SHLLLTPCRVAPPYPHLFS 
SCS PPGMVLDGS PKGAAVES VPVFG 


5730


1258 




LS LG T Y AS LHGR I Y CKFHFNQL FKS KGN Y DEG FGHR PHKDLWAT 
KI ETEGFHERPRNFENCGRPLKSFGGEDCPSC* GGCPGSNY * AQ 
GSS S REKGGQAS WN PKLRVA 


5731


122 


443 


RSKRGELIPKDSCYMRKPPRRPKKRRQG/CALPQGCLTFKDVAI 
KPGRGRGKQRRQEWFFLRVY 


5732


226 


772 


P PS RS C QS PRRKS RRRAHVT.VTLV CG FTS FS F S LPLYLCGCLR F 
PERTCSQLQOADWAPDFGPSSFVPSWGATATGARKFLIAFNl\N 
T ,TiGTKEOAHR2AI.NTjR EfiGRGKDOPGRLKXVOGIGWYLDEKNLA 
QVSTNl^LDFEVTALHTVYEETCREAQELSLPWGSQLVGLVPLX 
ALLDAA 


5733


1 


460 


PALQE VNANALAWGKQYENDARTLFE FTSGVNDTESP 1 1 YRDES 
MRTACS PDGLCS DGNGLEL KCP FTSR D FM KFRLGGFEAI KS AYM 
AQVOYSMWVTRKNAWYFAKYDPRMKREGLHYVVIERDEKYM\AS 
FDEI \VP\EFIGKMDEVLSRDPK 


5734 


3 


968 


RCNS PESLTSLUVLLTTANNLFVLI PAYS KNRAYAI F?I VFTVI 
G S L FLMN L LTA 1 1 YS Q FRG YLM KSLQTS L FRRRLGTRAAFE VLS 
S MVG EGGA FPQAVG VKPQNLLQ VLQKVQLDSSH KQAMMEKVRS Y 
GSVLLSAEEFQKIjFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFDYl/SNLIAIJuNLVSICVFLVLE'ADVLPAERDDFIIiGILNC 
VFI VY YIjLE.^IxLKVFALGLRGYLS Y PSNV FDGLLTVVLLiVLE I S 
TL\VCTDCHTQAGGRRWW/RLLSLWDMTRHLNMLIVFRFl»RIIP 
SMK PMA WAS TVLGL 


5735


2 


54C 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNT1AGRTYNDLNQ 
YPVFPW VliTNTYESEELDLTLPGNFRDLS KP I GALNPKRAVFYAE 
RYETWEDDQSPPYHYNTHYSTATSTLSWbVRI VS I FIELACLWY 
" LX1LT 


5736




382 


GTRPSTKKSGYSPQQVAVIHCKGHOKENTAVAHSNQKADSAAQV 
TARLSA/'TPPNLLPTVSFPOPDLPDMPVYSTTTEKLASDLRANO! 
QES * * 1 L PDSGI FI P * T * TS YLQSTTHLRRAKLPQLLRR 


5737 


290


1041 


KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRANLGPCRRKR 
U3TLNIRLAAGFQYSSHKDPSLSAKEKHTDYHNEARGPWPGWVG* 
RTADGS CGRGPDGAHHPGPKSSSWRAS RLLPGLGGSHHLDAYVG 
RDLECGTPAPLQLEI PPOPRGHPAPIPTGQAGPRDSGPGASP* V 
ETR PLTDGR R * PGVRPVGWTP AHPAGTLRPRGAVEPS VSACGKW 
APSPTS0GCCEGRCDAVPKJ1RAWRTPLCSQ 


5738


e 


460 


DTLSLNCTLPETLPMTPSF* LSFL* FPGIiARAKSIPTKTYSNEV 
VTLWYR PPD I LLGSTDYSTO 1 DMW * GQVEVWQGPCGKGGGLVTT 
ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FRIISEEAWALCAVETHR 


5739


j 


1222 


S FQRRG I RWNVHTLHPHPRAV WAGI GRGHGS * ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDIjLAEV 
SAEVDGPVPGYLSS P0S I TDTCLYI FTSGTTGIjPKAARISHLKI 

i f\r*f\r > wr\'\ /^r**.njr^i?T^\7T vt Til "di vuMcr ct t <n.~r\ir^ f^Wf 1 f^TSTXT 
liyuyvsr iyLit-V3vn^r,JL»vl i Ij/UjJrljIrii'lovjoJLuAa.L vVjurjIjJ.VjAil V 

VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRIAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVATINY 

TrnDrnvrDicuT winrncci lovTMrrTrrBTDnrnou^MATc 
JOyKGAVGKAoWIjiKHlr Fr blilKjUv J. IL>r*Pl KUrwoHC-iuA i b 

PGE PGLLVAP VSQCS P FLG Y AGG PE LAQG KLLKD VFR PGD VF FN 

TRDLLVCDDOGFLR FHDRTGD P FR W KGENVATTEVAE V FEALD F 

LQEVNVYGVTV 


5740 




231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
LQSELHKLYDEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVbMTLLQOSAMTLPLWIGKPGDRPPPLCGAIPASGD 
lv P*Kr^u)\vJ\PiKv ]SJ\vIj\j I I r. WD J. Lie 

EGKERHTLSRRRVIPLPQVJKANPETDPEALFQKEQLVIALYPQT 
TCFYRALT HAPPQRPQDDYS VLFEDTS YADGYS PPLNV/iQRYW 
ACKEPKKK* CRLADSPS PNDTGODSRGRAGI KH I PPLKKK 


5742


2 




TY VN I PDRSGDTVLI GAVRGGHVE I VRAI»LQKYADID1 RGQDNK 
TALYWAVEKGNATMVRDI LOCNPDTE I CTKDG 


5743 


2 


415 


G KTP EG I DA I EE I E I DLEETER EI S PQENGLE E VKPLGEMQTDL 
KATGREI S PREKTPEVIDATEEIDKDLEETGRREI SPEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAAEVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTROMTTTPAALPTTWTTPDLTTGTPLOMTTIA 
VFTTANTCLSLTPSTLPEEATGLLTPEPSKEGPILTAESETVLP 
SDSWSS AESTS ADTVLLTS KES KVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGI PMS MKNEMPI SQLLMI 1 AP 
Sl^FVLFALFVAFl^RGKLMETYCSQKHTRLDYIGD^KWLNDV 
OHGREDEDGLFTL 


5745 


1400 


595 


GK<SR PVNT .MICH^ K KT Y FO DFT jEI")Y I KVOKARGLEP KTCFR KM 

KGDYLETCGYKGEVNSRPTYRMFDQRIiPSETIQTYPRSCNIPOT 
VENRLPQWLPAHDSRLRIjDSLSYCQFTRDCFSEKPVPLNFNQOE 
YICGSHGVEHRVYKHFSSDNSTSTHOASHKQIHOKRKRHPEEGR 
EKSEEERSKKKRKKSCEE1DL.DKHKSIQRKKTEVEIETVHVSTE 
KLKNRKEKKSRDWS KKEERKRTKKKKEOGQERTEEEMLVJDQS I 
LGF 


5746 


3 


821 


S FASGRLT P S S P A FDGELDLQR Y SNGP AVSAWSLGMGAVSWSES 
RAGERRFPCPVCGKR FRFNS I LALHLRTHQPERPRS PAARLLLE 
LEERALLREARLGRARSSGGMOATPATEGIiARPQAPSSSAFRCP 
YCKGKFRTSAERERHLHILKRPWKCGLCSFGSSQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPOPQPPPQPEPRSVPQPEPEPQPER 
EATPTP AP AAPEEP PAPPEFRCQVOGQS FTQSWFLKGHMRKHKA 
SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRODVDTEPOKRKTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWETNRDD 
SDNGDINYDrVHELSLEMKROKIORELMKLEQENMEKREElIIK 
KEVSPEWRSKLSPSPSL2KSSKSPKRKSSPKSSSASKKDRKTS 
AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKyKE 
KY KVKDR I E E KT R DG KDRG 3 DFERQR3KRDKPRS TS P AGQHHS P 
ISSRHHSSSS0SGSS1QRHSPSPRRKRTPSPSYQRTLTPPLRRS 
AS PYPSHSLSS PQRKQS PPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREySODOSSSRDHRDDREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLJAlFPYAGLQPSCySSLKHLYKWAIPAEG 
KKNENLQNLLCG SGAGVI S KTLTY PLDLFKKRLQVGGFEHARAA 
FGOVRRYKGLMDCAKOVLOKEGALGFFKGLSPSLLKAALSTGFM 
FFS YEFFCNVFF! CMNRTASQR 


5749 


552 


3 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 
SASSTYSSAEE R MQSEQ I R KLRR E LES SQE KVATLTSQLS ANAN 
LVAAFEQSLVNK'J'SRLRHliAETAEEKDTELLDLRETIDFLKKKN 
SEAQAV1QGAUJASETTPKELRIKRQNSSDSISSLNSITSHSSI 
GSSKDADA 


5750 


22 


bee 


IF1 SI CLWKAH L C FbLLP KDC 2 DQ VMKLQNLFVDDS GR YLAI QF 
IILEWAYVFLYYyEYRKAKDO^DIAKDISQLQIDLTGALGKRTRF 
QENYVAQLILDVRREGDVbSNCEFTPAPTPQBHLTKNLELNDDT 
I LND I KLADCEO F OMPDLCAEE I AI I LG I CTNFQKNN PVHTLTE 
VELLAFTSCbLSOPKFWAIOTSALILRTKLEKGSTRRVERAMRQ 
TCAlADQFEDKTTS VLERLK I FYCCQVPPHWAI QRQLASLLFEL 
GCTSSALQIFEKLEMWE 


5751 


3 


753 


SCGSALRAWRCGAAALATFPAPALPGbMYRALYAFRSAEPNALA 
FAAGETFbVLERSSAHWWIAARARSGETGYVPPAYbRRLQGLSQ 
DVLQAI DRAI E AVHNTAMRDGG KY S LEQRGVLQKL IHHRKETL»S 
RRGPSASSVAVMTSSTSDHKbDAAAARQPNGVCRAGFERQHSLP 
SSEHLGADGGLFOI PLPSS01 P PQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


GPVCGVGLSVAVJAG PWRG P VHS VGGGGRAALHGAELPCLSGAAT 
VEREMELRHKNEMLRVETEARARAKAERENADIIREQIRLKASE 
HRQTVLESIRTAGTliFGEGFRAFVTDRDKVTATVNIFIKQGWQV 
AERQHVG AS WS PRSCP CRLC7AL 


5753 


34 


483 


DDSXA3 PGGVQAF FGAVRN 1 Y 7PRTGHRIRKLDQ1 QSGSNYVAG 
GQEAFKKLNYLD3 GEI KKRPKEWNTEVKPVIHSRINVSARFRK 
PLQEPCTIFIilAIvGDLINPASRLLIPRKTLNQWDHVbQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331


TLVHWEFAGEKAEAIASREQEVIjOGWKELIjSACEDARIjHVSST 
ADALR FHS Q VRDLLS VJMDG 3ASQI GAADKPRCPS S LLGLPASPW 
WPTPATPS PLTAF FSME 


5755 


3 


bbe 


LGDOFY KEA1 EHCRS YNS RLCAE RS VRLPFLDSQTG VAQNNCY I 
WMEKRHRGPGLAPGOLYTYPARCWRKKRRLHPPEDPKLRLLEIK 
PEVELPLKKDGFTSESTTLEALbRGEGVEKKVDARBEESIOEIQ 
RVLENDENVEEGNEEEDLEED2PKRKNRTRGRARGSAGGRRRHD 
AASQEDHDKPYVCDICGKRYKNRPGLSYHYAHTHLASEEGDEAQ 
DQETRSPPNHRKEKHRPQKGPDGTVXPNNYCDFCIX5GSNMNKKS 
GRPEELVSCADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621


SSKLQAbFAHPI.Y NVPEEPP LU5AEDSLLASQEALRY YRRKVAR 
WNRRHKMYRSQMKLTSLDPPLOLRIiEASVn/QFHLG INRHGLYS R 
SS PWSKLLQDMRKFPTI SADY SQDEKALLGACDCTQ IVKPSGV 
HLKLVLRFSDFGKAMFKPMROORDSETPVDFFYFIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGR3 VNVTKEIL 


5757 


3 


473


YKDALLLPDNHRCWFENGTbKLTbvOKGMDEGEYLCSVLIQPQ 
LS ISQSVHVAV KV F PLI Q PFE FPPAS IGQbLY I PCWSSGDMP1 
RITWRKDGQVIISGSGVTIESKEFMSSLQISSVSLKHNGNYTC1 
ASNAAATVS RERQL I VRVPPR FW 


5758 


1 


474 


FRRGAGAERGEHKEGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNOSNRLAVSRTDGTVE I YNLS ANYFQEKFFPGHESRATEALC 
WAEGQRLFS AGLKGE I ME YDLQALN I KYAMDAFGGP I WS MAAS P 
SGSQLLVGCEDGSVKLFQITPDKI PV 


5759 


2 


1240 



GNAAFAGQGV V YETFHMS DLPSYTTNGTVHVWNNC 1 GFTTDPR 
MAR SS F Y PTDVAR VVN AP I FHVNADDPE AV I YVCS VAA E WRNT F 
NKDVGADLVCYRRRGHNEMDEPMFTQPLMYKQIHROVPVIiKKYA 
DKL I AEGTVTLQEFEEEI AKYDRI CEEAYGRSKDKK I LH I KHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDKLTHIGSVASSVPLEDF 
KIHTGLSRI LRGRADMTKNRTVDWALAEYMAFGSLLKEGI HVRL 
NGQDVERGTFSHRHHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLSEYGVU3FELGy7J4ASPNALVLWEAQFGDFHNTA0CIIDC>F 
1STGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFL0MSNDD 
SDAYPAFTKDFEVSQL 


5760


1 


1221 


VRDITSDSLSLSWTVPEGQFDKFLVQFKNGDGOPKAVRVPGHED 
GVTISGLEPDHXYKMNLYGFHGGQRVGPVSAVGLTAFGKDEEMA 
FASTEPPTPEPPIKFRLEELTVTDATPDSLSLSWTVFEGQFDHF 
LVQYKNGDGQ PKATR VPGHE DRVT I SGLEPDNKY KMN L YG FHGG 
CRVGPVSA3GVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SSPDSLSLSWTVPGGRFDSFTVQYKDRDGRPQWRVGGEESEVT 
VGGLEPGRKYKMHLYGLHEGRRVGPVSTVGVTAPOEDVDETPSF 
TEPGTSAPEPPEEPLLGELTVTGSSPDSLSLSWTVPOGRFDSFT 
VQ Y KDRDGR PQAVR VGG QE S K VTVRG1>EPGRKYXMKLY GLHE GR 
RLGPVSA1GVT 


5761 


3 


1275 


SCDMAEAAALVW I RGFGFGCKAVRCASGRCTVRDFI KRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQIEKTTNREACRDLSGRRLRDVNHEKAMAE'^VKOOAERE 
AEKEOKRLERIiORKt.VFPKHCPTSPDYOOOCHFMAFRl .FD^VT .K 
GMQAASS KMVSAE I SE7CRKRQWPTKSQTDRGASAGKRRCFWLGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSG S QRAR WNT DHGS P EQLQ I P VTDSGRH I LEDS CAELGE S K 
EHKESRMVTETEETQEKKAESKEPIEEEPTGAGLNKI)KETEERT 
DGER VAE VAPEER ENVAVAKLQESQPGNAVIDKETI DLLAFTS V 
AELELLGLEKLKCELMALGliKCGGTLQ 


5762 


2 




GSTGQTPLHSQGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MSSEEAANGKKSHWAELEISGKVRSLSASLWSLTHLTALHLSDN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LDKJDTGL1MLIARLDYELIQRFTLTIIARDGGGEETTGRVR3NV 
LDVNDNV PTFCKDAY VGALRENEPSVTQLVRLRATDEDS PPNNQ 
1TYSI VSASAFGS YFD3 SliYEGYGVISVSRPLDYEO 1 SNGLI YL 
TVMAMDAGN 


5764 


19 


441 


VCARACG EMRQLLRP I DRQR Y DENEDLS D VEE IVSVRGFS LEEK 
LRS QL YQG D FVHAM EGKDFN YE YVQRE ALR VP LI FRE K DGLG I K 
MFDPDFTVRDVKLLVGSRRLVDVMDVNTQKGT5MSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


QK I LRLNNSHQPPTS SS NSKDCGGPASSGAGATAALADGLKFAS 
VQASAPOGNSHKEI'SKSKVKRSKTSKDANKSLPSAALYGIPEIS 
S?G KRQE VOG R PGE ATG MNS ALGQS VS SGGSGN PNSN S TSTSTS 
AATAGAGSCGKSKEEKFGKSOSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYGFGAKSNGGGAS PFHCGGTG SGS VAAA 
GEVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKKLKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


SGLFSVDPASSOAMELSDVTLI EG VGN EVM WAG WVL 1 LALVL 
AWLSTYVADSGSNQLLGAIVSAGDTSVLHLGHVDHLVAGQGNPE 
PTE L PH PS EGNDEKAE E AG EGRGDS TG EAG AGGG VEF £ LEHLLD 
IGGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKS KYFPGQESQMKLI YQGRLLODPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLM VP V FWLLG VVW Y FR I NYRQFFTAPATVS LVG VTV F F S FLV 
FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDV7RALMKRKRMKANIKLVG 
SGFPLPSSDLDDSLTEEIDEKIGFRNDANFDWQNVADFRDAGGS 
LTE VKVEE EBRD PQS P E FE I EEEEEMLS S VI PDSRR EM E L PDFP 
HIDEFFTLNSTPSRSAYDEFHLLVNIEKQKLELEKRRLDIEAER 
LOVEKEKLOIEKERLRHLDMEHERLOLEKERLQIEREKLRLQIV 
NSEKPSLENELGOGEKSMLQPQDIETEKLKLERERLQLEKDRIiQ 
FLK?ESEKLQIEKERLQVEKDRLRIQKEGHLO 


5768 


- 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
AAKVA KDY P F Y LTVKRANCSLELP PASG PAKDAEEPSN KR VKPL 
SRVTSLANLI PPVKATPLKRFSQTLQRS1SFRSESRPDILAPRP 
KSRttAAPSS TKRRDSKLVJS ETFDVC 


5769




667 


TKTKKGVKEKATDOSVKAFAEHCPEbQYVGFMGCSVTSKGVIHL 
TKLRNLSSLDLRHITELDNETAMEIVKRCKNLISLNLCbNWIIN 
DRCVEVXAKEGQNliKEliYLVSCKITDYALIAIGRYSMTIETVDV 
G W C K E I TDOG ATL1 AQS S X S LR YLGLMR CD K VN E VTV EQL. VQQ Y 
PK I TFSTVLODCKRTLERAYQMGWT PNMSAASS 


5770 


j 


484 


BSRRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLP7TLTLA 
FASOLKKTS LSLTPDVPEADLSEVDPXLVSNLMP FQRAGVNFAI 
AKGGRLLLADDMGLGKTIOAICIAAFYRKEWPLLVWPSSVRFT 
WEOAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


161


741 


GLLPSACLKARSWREASEGPSSRACSNGSGUTFEACYSGTSTPS 
FHGSHCSGSDHSSLGLEQLQDYMVTLRSKLGPLEIQQFAMLLRE 
YRLGLPIQDYCTGLLKLYGDRRKFLLLGMRPFIPDQDIGYFEGF 
LEGVG I REGG I LTDS FGR I KRSMSSTSASAVRS YDGAAORPEAQ 
AFHRLLADITHDIS 


5772 


14 t 


383 


EFKlaALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773




723 


PR V R S KHNF C FMEMNTRLG VEHP VTEM I TGTDLVE WQLR I AAGE 
KI PLSOEEI Ti^QGHAFEAR 1 YAEDPSIWFWPVAGPLVHLSTPRA 
DPS TR I ETGVROGDEVSVHYDPM I AKLWWAADRQAALTKLR YS 
LROYN I VGLK TN I BFLLNLSGHPE FEAGNVHTDFI PQKKKCLLL 
SRKAAAKESLCQAALGL3LKEKAMTDTFTLQAHDQFSPFSSSSG 
RRLN 1 S YTRNMTLKDGKNS K 


5774 




592 


FVEEENIRWRCGGSELNFRRAVFSADSKYIFCVSGDFVKVYST 
VTEECVHILHGHRNLVTGIOLNPNNHliOLYSCSLDGTIKLWDYI 
DGIL1KTFIVGCKLHALFTLAQAEDSVFVIVNKEKPDIF0LVSV 
KLF K S S SQE V EAKELS FVLDY INQS PKCI AFGNEGVYVAAVREF 
YLSVYFFKKETTSRVTLSSS 


5775


3 


538 


SSGCCDPAAPSSLAEAATMPVSKCPKKS3SLWKGWDRKA0RNGL 
RSOVYAVNGDYYVGEWKDNVKHGKGTOVWKKKGAIYEGDKKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDVJ C G S OR SG WGRMY YSNG D I YEGQWENDK PNG EGMLRLSQN P 
RP 


5776 




484


RLP0DCVCONLSES1/5TLCPSKGLLFVPPDIDRRTVELRLGGNF 
IIHISRODFA^MTGLVDLTLSRNTISHIQPFSFLDLESLRSLHL 
DSNRL P S LG E DTLRGLVNLQHL I VNKNQLGG 1 ADE AFEDFLLTL 
EDLDL-S YNNLKGPAVGLRGDAW VQPS TS 


5777 




949 


GQDPEPGQDLFOPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 
VXKLEOALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
GGSGSEVSQRVHPSDLEGREPTPELVEDRKGSCRRPWDRSLENV 
YRGSEGSPTKPFJNPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLQSS S ES GR\T)WYAQTKLGLTRTL£ E ENVY EDI LDP PMKENP 
YED1 ELHGRCLGKKCVLNFPASPTSSI PDTLTKOSLSKPAFFRQ 
NSERRNV 


5778 


_ 


1210 


QRROSVSRLLLPVFLLEPPAEPGLEPFPEEEGGEPAGVAEEPGS 
GG P C WLQLE E V PG PG PLGGGG PLRS PS SYS S DELS PG E P LTS PP 
WAPLGmPERPEkLLNR vLLKLiAG^AI RUb/wz>U±ij±jVVl VLtl r;£> 
LFLPTEKFLOELHQYFVRAGGMEGPEGLGRKOACLAMLLHFLDT 
YOGLLCEEEGAGH 1 1 KDLYLLI MKDESLYQGLREDTLRLHOLVE 
TVELKIPEEKOPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMFDHSYVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 
LI LVAVSSSGE KVLLQPTEDCVFTALG INSHLFACTRDS YEALV 
PLPEE10VSPGDTEIHRVEPEDVANHLTAFHWELFRCVH3LEFV 
DYVFHGE 


5779


138 


1571 


EAVQVL3 KHS AD VN ARD KN WQT P LHVAAAN KAVKCAE V 1 1 PLLS 
SVNVSDRGGRTAbHHAALNGHVEMl'NLLLAKGANINAFDKKDRR 
ALHWAAYMGHLDWALLlNKGAEVTCKDKKGYTPLHAAASNfGQl 
NW KHLLNLGVE1 DE I NVYGNTALH IACYNGQDAWNEL I DYGA 
NVNOPNNNGFTPLHFAAASTHGALCLELLV10NGADVNIQSKDGK 
oi'ijr.ni/ivMuKr x vji JbxyWvjor.iiJv_ vJJivLAjry l i'jjri va>\k ion 
ELLI KTLI TSGADTAKCGI HSM FPLHLAALNAHSDCCRKLLSSG 
QKYS 1 VSLFSNEHVL-SAGFEI DT PDKFGRTCLHAAAAGGNVECI 
KLLQS SG ADFHKKDKCGRTPLHYAAANCHFHCI ETLVTTGAN VN 

EATLC LE FLLQNDAN PSIRDKEGYNSI HY AAA YGHRQCLE LLLE 
RTNSGF5ESDSGATKSPLHLAVSEMP 


5780 


154 


624 


OFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS 
EIETSWKGSHFPVGWPPRAKSPTPESST1ASYVTLRKTKKMM 
DLRTER PRSAVEQLCLAE STRPRMTVEEQMER I RRHCQ ACLREK 
KKGLNVIGASDQSPLQSPSNLRDNP 


5781 


19 


941 


RGSLGGHPWRP PMRAASQG CLP V S F VTG PKQ ERAYGGRG PGG AF 
P APP VS G TCP PDL I Y A PTPEKAEGG SQ KNHO P PPG ERAAHR DGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPI>GQ 
VQPHFTSODAKSAEDEAPSRHLGKHQPRSAOVGSRLDALQGPKT 
QHS3 HTVTCKSPRQKEDRSPKPPQAPKH PEEHGRQS\QAPPPLP 
VArSRTCGGC*TWDPALLVS?/PQGDSTPEl,PAP\QQPTGGPSR 
CRC AL P PQG * RQQ P RQR PR / PTGAS R S H PAKAKGCOGP PK I RNY 
NIMD 


5782 


5176 


1237 


DRSMMSMAADSYTDSYTDTYTEAYMVPPLPPEEPPTMPPLPPEE 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SOPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
P E P ES S I TLT PVE S A WAEEHE WPERP VTCKVSETPAMS AE PT 
VLASE P P VMSETAET FDSMRAS GH V AS E VSTS LLVPAVTT P VLA 
ES 1 LEP PAMAAPESS AMAVLES S AVTVLESS1 VTVLESSTVTVL 
EPS WTVPEPPWAE PDYVTI PVPWSALEPSVPVLEPAVSV7..Q 
PSMI VSEPSVSVQES TVTVSE PAVTVSEQTOVI PTEVAI ESTPM 
ILESSIMSSHVMKG1NLSSGDQNLAPEIGM0EIALHSGEEPHAE 
EHLKGDFYESEHGIKIDLN1NNHL1AKEMEKNTVCAAGTSPVGE 
1 GEEKI LPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 
ATG\TS KG I EFTTAS TLSLVNKYDVDLS LTTQDTEHDMLI STSP 
SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTNEPLPVKRD\DQ 
TLAALI \SLXESSGGEKEVPPPS* REHLPDSGFSANIEDINEAD 
LVR P VS S PRT WNVLP S PRAGL \ EG P \ LLAS D FGP VQNL Y SS P W 
\ S SMP\ ERASGS\SSGEKGG \ YE I FVKVXDTH EKSKKNXNRDKG 
EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 
HRS\C71'RSRSRS/RDRRRRSSRSRSKSRGRRSVSKEKRKRSPKH 
RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSKTPSRRRRSR 
S VGRRRS FS I S PS RRS RTPS RR S RTPSRR S RTPSRRSRT P SRRS 
FTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 
RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 
GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 
DDDVIVNKPHVSDEEEEEPPFYHHPFKLSEPKPIFFNLNIAAAK 
PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVFS S NL PSEP VDI STAMS E RALAQ KR LS ENAFDLEAMS M 
LNRAQE R I DAWAQLJ* S I PGQFTGS TGVQVLTQEQLANTG AQAW1 
KKDQFLRAAPVTGGMGA\^MR^IGWREGEG1X3KNKEGNKEPILV 
DFKTDRKGLVAVGERAQKRSGKFSAAMKDLSGKHPVSALMEICN 
KRRWQP PE FLLVHDSG PDHRKHFLFR VL INGS AYQPNW FFLNR 
Y 


5783 


1693 


698 


DSGItRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
OKLGFGTG VWYZiMKRS PRGLSHS PWAVKKINP I CNDH YRS VYO 
KRLMDEAKI LKSLHH PN I VG YRAFTE ANDGS LCLAME YGG E KSL 
NDL1 EE/ PI * SQ/ PK J LFQQP/L I LKVALNMARGLKYLHQEKKL 
LHGD3KSSNW3KGD7ET1KICDVGVSLPLDENMTVTDPEACYI 
GTEPVJKP KEAVEENGVITDKADI FAFGLTLW EMMTLS I PHINLS 
NDDDDE D KTFDES D FDDEA Y YAALGTRP P I NMEELDES YQKVI E 
LFSVCTNEDPKDRPSAAHIVEALSTDV 


5784 


2669 


1388 


PRVRPHVRTDHNYYISRiYGPSDSASRDLVfVNIDOMEKDKVKIH 
GILSNTHRQAARVKliSFDFPFYGHFLREI TVATGGFI YTGEWH 
RMLTATOYIAPLMAI^FDPSVSRNSTVRYFDNGTALWQWDHVHL 
QDNYNLGS FTFQATLLMDGR1I FGYKEI PVLVTQI S STNHPVKV 
GLSDAFVWHRIQO 1 PNVRRR7 I YEYKRVELQMSKI TNI SAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRODW 
VDSGCFSESKEKMCENTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E / DAVTS QFP TSLP T E DDTK I Al»H LKDNGAS TDDS AA E K KGGTL 
HAGLJ VGILI LVL 1 VATAILVT VYM YHH PTS AAS I FFI ERRPS R 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785


2659 


1386 


PRVRPRVRTDHNYY1SRIYGPSDSASRDLWVNID0MEKDKVKIH 
GI LSNTKRQAARVNLS FDFPFYGHFLRE1 TVATGGF I YTGEWH 
RMLTATOYIAPLMAMFDPSVSRNSTVRYFDNGTAliWQWDHVHL 
QDNYNLGSFTFOATLLMDGRI 3 FGYKEIPVLVTQISSTNHPVKV 
GLSDAFVWHRIQO I PNVRRRTI Y E YH R VELQM SKI TW I SAVEM 
TPLPTCL'OFNRCGFCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCENT£PVET\FLEPPQP*ERQPPSSGS*LPP 
E /DAVTS QFPTSLFTE DDTK IALHLKDNG AS TDD SAAEKKGGTL 
KAGLIVG 1 LI LVLI VATAI LVTVYMYHHPTSAASIFF1ERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5786 


2532 


1674 


£YKLPAAERRASSCSQPPTPTRRRW?APGRTSRGHRPO;M*SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 
S*H*KRNLSQRSSSKSRRPLSCARPHR**RQGLTVAARLPTWAK 
SPPLACSFCQAA0KSOSLSSGRSTR*PERMSFR?\SPPGNPAIP 
SLAPSSRP/PKGRPOCTWIPSRWPASPTAPPTTT*APTSSPGST 
GRSMMTCFTRWTATFWS ARASSRPRNWPTP* WR PSGRLSTV* RA 
TGGSTATAPPKRFPRKWNPMMAE 


5787 


2 


1460 


MASAASVTS LADEVNCP \ I CQGTZiKEAGSLSNCG/HKKFCRACL 
T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWOLANV 
VEN I ERLQLVSTLGLG EEDVCQEHGEKI Y FFCEDDEMQLCWCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLIKEREEIQEIQS 
RENKRWOVLLTQVSTKRQQVISEFAHLRKFLEEQQSILLA0L2S 
QDGDI LRGRDE FDLLVAGE I CRFSALI EELEE KNERPARBLLTD 
I RSTL I R CETRKCR KPVAVSPELGQRIRDF POOALPLORBiyiKMF 
LEKLCFELDYEPAHISLDPQTSHPKLLLSEDKORAQFSYKWQNS 
PDNP0RFDRATCVLAHTG1TGGRKTV7WSIDLAHGGSCTVGWS 
HDVQR KG ELRLRPE EG VWAVRLAWGFVS ALG S F P\ TR LTLKEQP 
RQVR VSL D YE VG WVTFTNA VTR EP I YTFTAS FTRKVI PFFGLWG 
RGSSFSLSS 


5788 


2 



6860 


E HS VS GR S S AYGDATAEGH PAG PGS VS SS TGAI S TTTGHQEGDG 
S EGBGEG E TECDVHTS NRLHMVRLMLLERLLOTL PQLRNVGGVR 
AIPYMQV I LNJLTTDLDGEDEKDKGALDNLLSQL1 AELGMDKKDV 
SKKNERSAIJ^EVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 
ATAAALLS S GAVDY CLHVLKS LLEY WKSQQNDEEPVATSQLLKP 
HTTSSPPDMSPFFLROYVKGHAADVFEAYTQLLTEMVLRLPYQI 
KKITDTK SRI PP PV FDHSWFYFLSEYLMIQQTP F VRROVRKLLL 
F I CGS KE K YRQLRDLHTLD S \ H VRG I KKLLEEQG I FLRAS WTA 
SPOSALQYDTLI SLMEHLKACAEI AAQRTINWQKFCI KDDS VLY 
FLLQVS FLVDEGVSPVLLQLLS CALCGS KVLRALAASSGSSSAS 
SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 
LCTALVNOLNKFADKETLIOFLRCFLLESNSSSVRWQAHCLTLH 
I YRNSSKSOOELLLDLMWS I W PELPAYGRKAAQFVDLLGYFSLK 
TPQTEKKLKSYSQKAVEILRTQNHILTNHPNSNIYNTLSGLVEF 
DGYYLESBPCLVCNNPEVPFCYIKLSSIKVDTRYTTTQQWKLI 
GSHTISKVTVKIGDLKRTKMVRTINLYYNNRTVOAIVELKNKPA 
R WHKAKKVQLTPGQTE VKIDLPLP I VASNLMI EFADFYENYQAS 
TETLOCPRCSASVPANPGVCGNCGENVYOCHKCRSINYDEKDPF 
LCNACG FCKYARFDFK L.YAKPCCAVDP 3 ENE EDRKKAVSN I NTL 
LD KADR VYHQLMGHR P QLENLLC KVNE AA PE KPQDD SGTAGG I S 
STS ASVNRY I LQLAQE Y CGDCKNS FDELS KI IQKVFAS RKELLE 
YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 
CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNIjRRGA 
AAMREE VRQ LMCLLTR DN PE ATQQMNDL I IGXVSTALKGHV3ANP 
DLASSLQYEMLLLTDS 1 S KED5 CWELRLRCALSLFLMAVNI KTP 
VWE^ITLMCLRILOKLIKPPAPTSKKKKDVPVEALTTVKPYCN 
EIHAQAQLWLXRDPKASYDAWKKCLPIRGIDGNGKAPSKSELRH 
LYI.TEKYVWRWKQFLS K RGKRTS PLDLKLGHNNWLRQVLFTPAT 
QAARQAAC7 I VEALAT IPSR KQQVLDLLTS YLDELS I AGECAAE 
YLALYQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLALEEA 
TLSTDLQQGYALKSLTGLLSSFVEVESIKRHFKSRLVGTVLNGY 
LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFMAVCI 
ETAKRYNLDDYRTPVF 1 FERLCS I I YPEENEVTEFFVTLEKDPQ 
QEDFLQGRMPGNPYSSN E PG 1G PLMRDI KNKICQDCDLVALLED 
DSGMELLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 
LLGDATEEF I E SLDSTTDEEEDEEE VYKMAGVMAQCGGLECMLN 
RLAGIRDFKQGRHLLTVLLKLFSYCVKVKVNRQQLVKLEMNTLN 
VMLGTLNLALVAEQES KDSGGAAVAEQVLS 1 MEI \ IQAEPNVEP 
LSEDKGNLLLTGDKDOLWLIoDOINSTFVRSNPSVLCGLLRIIP
YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 
IAAGIK\NNSNGHQL\ KDL \ I LQKG I TONALD \ YMKKH I P/SAA 
RI WDADI \ W KS FCLR PAL P F I LRLLRG LAI QH PGTQVL I GTDS I 
PNLHKLEQVS\SDEG1GTLA\ENL\LESLREHPDVNKKIDA\AR 
RETRAEKJCRMAMAMROKALGTLG \ MTTMEKGQWD/TRTALLEA 
DWEELI EEP\GLTCCI CR EGYK FQPTKVLG I YTFTKRWLGGVW 
ENKPRETSRATSTVSHFN1VHYDC\HLA\AVSLARGREEWESAA 
LQNANTKCNGLLPVWGPKVPESAFATCLARHNTYLQECTGOREP 
TYQLNI HDI KLLFLRFAft EOS FSADTGGGGRESNI HLI PYIIHT 
GLYVLNTTRATS REEKN LOG FLEQ P KE KWVES AFEVDGP Y Y FTV 
LALHI LPPEQWRATRVEI LRRLLVTSQARAVAPGGATRLTDKAV 
KDYSAYRSSLLFWALVDL1YNMKKKVPTSNTEGGWSCSLAEYIR 
HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGLLSEITDPE 
SFLKDLLNSVP 


5789 


1 


2407 


LPLHAVEKTGR PGQPAL KM PGKLRS DAG LESDTAMKXGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSONDI SPKTKSLRKKKEPI EKKWSSKTKK 
VTKNEEPSEEEIDAPKPKKNKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFPI QAKTFHHVYSGKDL1 AQARTGTGKTFSFAI PL 
XEKLHG\ELODRKRGRAFQVLVLAPTRELANQVSKDFSDITKKL 
SVACFYGGTPYGGQFERKRNG I DI LVG7PGRIKDHIQNGKLDLT 
KLNHVVLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHlfVFNV^KKYMKSTYEQVDLlGKKTQKTAITVEHLAIKCH 
WTQRAAVI GDVI RVYSGKQGR7I I FCETKKEAQELSONSAIKQD 
AQSLKGDI PQKOREI TLKGFRNGS FGVLVATNVAARGLD I PE VD 
LVIQSS PPKDVESYIHR SGRTGRAGRTGVCI CFYQHKEEYQLVQ 
VEQKAG I XFKRI GVPSATEI 1 KASSKDAI RLLDSVPPTAI SHFK 
QS AEKL I EEKGAVEALAAALAH I SGATS VDQRSLI NSNVGFVTM 
ILQCS I EMPNI S YAWKELKEQLGEEI DSKVKGMVFLKGKLGVCF 
DVPTASVTEI QEKWHDSR RWQLS VATEQPELEGPREGYGG FRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGORSGGGNKSNRSONK 
GQKRS FS KAFGQ 


5790 


3786 


1585 


ARRQRDPLQALRRRNQELKCX)VDS LLSESOLKEALEPNKRQHI Y 
QRCI QLKQAI DENKNALOKLS KADESAP VAN YNQRKEEEHTLLD 
KLTQQLQGLAVTI S REN 1 TEVGAPTEEEEESESEDSEDSGGEEE 
DAEE EEEE KE ENESHKWS TGE E Y I AVGD FTAQQVGDLTFKKGE I 
LLVIEKKPDGWW1AKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 
GSEEDVEAVDETADGAEVK\ORTDPHWSAVQKAI SEAGI FCLVK 
HVSFCYLIVLMRNRMETVEDTNGSETGFRAWNVOSRGRIFLVSK 
P VLQOINTVDVLTTMG AI PAG FR PSTLSQLLEEGNQFRANY FLQ 
PELYiPSQliAFRnLMWDATEGTIRSRPSRISLILTLWSCKMIPLP 
GMS1QVLSRHVRLCLFDGNKVLSNIHTVRATW0PKKPKTWTFSP 

ELSCGWVFLKLFDASGVPIPARTYELFLNGGTPYEKGIEVDPSI 
SRRAHGSVFYQIMTMRRQPQLLVKLRSLNRRSRNVLSLLPETLI 
GNMCSIHLLIFYR0ILGDVLLKDRMSLQSTDL1SHPMLATFPM1, 
LEQPDVMDALRS SWAGQES \TiiKRSBKR \ PKS FLKVPR FLLVYH 
\GC^LPI*L/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GAL0A1LSPDGVHEP FDLS EOTYD FLGEMR KNAV 


5791 


3 


1636 


LRVAEFAGTS R/ IGAGLIQPLHRAPARDHGLLRGGAAPALSVSH 
GK/GKQL/AMSSOGSDDEQIKRENIRSLTMSGHVGFESLPDOLV 
NRS I OQGFCFKI LCVGETG IGKSTLI DTLFNTNFEDYES SHFCP 
NVKLKAQTYELQESNVOLKLTI VNTVG FGDQ1 NKEES Y Q? I VD Y 
1DA0FEAYL0EELKIKRSLFTVHDSRIKVCLYFISPTGHSLKTL 
DLLTMKNLDS KVY1 1 P VI AKADTVSKTELQKFXIKLMSELVSNG 
VQ I YQ F PTDDDX I AKVNAAMNGQLP FAWGbMU b VKVbWiVMVKA 
RQYPWGWQVE^EKHCDFVKLREMLICTNMEDLREQTKTRllYEL 
YRRCKLEEMGFTDVGPENKPVSVQETYEAKRHEFHGBRQRKEEE 
MKQMFVQRVKEXEAILKEAERELQAXFEHLKRLHQEERMKLEEK 
RRLLEEEI1AFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 
FVKQKVPEHRRSSSQANFI KKKLEVCFDFAV: CFITS 1 FGEQPQ 
LL1 FMEKYFQVOG0YISQSE 


5792


2263 


653


AAAA PS PAW WCG VF WY WHTCWVMYGI VYTR PCSGDASCI QPY 
LARRPKLQL\RKS FTTTRSHLGAENN I DLVLNVEDFDVESKFER 
TVNVS VPKKTRNNGTL YA Y I FLHHAG VLPWH DG KQVHLVS PLTT 
YMV PKP EE I NLLTG E S DTQQI EAJDKKPTS ALDE ? VSH WRPR LAL 
NVMADNFVFDGS SLPADVHRYKKM1 QLGKTVHYLPILFI DQLSN 
R VKDLMVI NR S TT ELPLT VS YD KVS LGRLRF W I H MQDAV Y S LQQ 
FGFS E KDADE VKG I F VDTNIjY FLAJUT F F VAA FK LLFDFIjAFKND 
1SFV7KKKKSM1GMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTIFWRGLMPEFQFGTYSESERKTEEY 
DTQAKK YLS YLL Y PLCVGGAVYSLLNI XYKSWY S WLINS FVHGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAFS PAWWCGVF W YWHTCW VMYGI VYTR P CSGDASC 1 OP Y 
LARRPKLOIi\RHSFTTTRSHLGAENNIDLVLN\rEDFDVESKFER 
TVNVSVPKKTRNNGTLYAY I FLKHAGVLPWHDG KQVHLVS PLTT 
YMVPKP EE INLLTG ESDTQQI EADKK PTS ALDE P VS HWRPR LAL 
NVMADN FVFDGS S LPADVHR YMKM1 QLGKTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFW1HMQDAVYSLQQ 

'Cr , l?CTrV7MvrYC > \7V/™'T JTimTNTT VPT 2lT TPP\7ZiZiFVf T .T.PDPT lAFTOjn 
rurc£>fkUnUCiVlVui.r vu liNLii r linu 1 r r vtv\r nijut ur unr ivivu 

I SFWK KKKSMI GM S T KAVLWRC FST W I F LFLLD EQTSLL VLVP
AGVGAA I ELWKVK KALKMT I FWR GLMPE FQFGT Y SES ER KTE E Y
DTQAMKYLS YLLY PLCVGGAVYS LLNI KYKSWYS WLINSFVNG V
YAFGFLFMLPOLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF
riTMPTSHRLACFRDDVVFLVYLYQRWLYPVDKRRVNEFGESYE
EKATRAPHTD


5794 


1 


5016 


MGPRLSVWLLLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV 
KGQKGERGLPGLQGVIGFFGMQGPEGFQGPPGQKGDTGEPGLPG 
TKGTRGPPGASGYPGNPGLPGI PGQDGPPGPP31 PGCNGTKGER 
GPLGFPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQSPVGPPGFTGPPGPPGPPGPPGEKGQM 
GLSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYFGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 
GDOGPPG I PGQPG F I GE I GEKGQ KGESCLICD I DGYRG PPGPQG 
PPGEIGPPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 
GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGS PG S VG L KGE RG P PGG VG F PG S RG DTG P PG P PG YG PAG PIG 
DKGOAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGF 
PGPQGDRGFPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPG?KGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGOKGEPGVGLPGLKGb 
PG LPG I PGTPG E KGS I G V P G VPG EHG AI GPPGLQGI RG E PG P PG 
LPGSVGSPGVPGIGPPGARGPFGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGI PG FPG 
SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\!IGFPGSSGPRGD 
PSLKGDKGDVGLPGKPGSMDKVYMG5MKGOKGDQGEKGQ3GPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGI PGPQGSPGLPGDKGAKGE KGQ 
AGPPGIGIPGLRGEKGDOGIAGFPGSPGEKGEKGSIGIPGMPGS 
PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGJPGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 
G HAT EGP KGDRGPQGQPG L PGLPG PMG ? PGLPG I DGVKGD KGNP 
GWPGAPGV PG P KGDPGFQGMPGI GGS PG ITGS KGDMG PPGVPGF 
QGPKGLPGLQGIKGDOGDOGVPGAKGLPGPPGPPGPYDI 1 KGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 
VTRHSQT I DDPOCP SGTK1 LYHGYSLLY VQGNERAHGQDLGTAG 
SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 
I TGFNI R P FI SRCAVCEA PAMVMAVKSQTIQI PPCPSGWSS LW I 
GYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPF3ECHGRGTCN 
YYANAYS FWLATI ERSEKFKKPTPSTLKAGELRTHVSRCOVCMR 
RT 


5795


1192 


63 


STRS PTVE Yl SAH PHI LFMLLKGYEAPQ I ALRCG I MLRECI KHE 
PLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEON YDTI FEDYEKLLQS2NYVTKRQSLKLLGELI LDRHN 
FAIMTKY3SKPENLKLMMNLLRDKSPKIQFEAFHVFKVFVASPH 
KTQP1VEI LLKNQPKLIEFLSSFQKERTDDEQFADEKNYLI KQI 
RDLKKTAP * RALRDSKR 


5796 


2 


107£ 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFVVNRKF 
FGE I GLLD PGMDVYGGEN I E LG I KVWLCGGSMEVLPCSRVAH I E 
R KKK P YNSN I G FYTKRNALR VAEVWMDD YKS HVy I AWNLPLEKP 
GIDI GDVSE R RALRKSL KCKNFQ W YLDHVYPEMRR YNN TVAYGE 
hum KAK D VCLDQG PLENHT A I LYPCHGWG PQLAR YTKEG FLHL 
GALGTTTLLPDTRCLVDNSKSRLPQLLDCDKVXSSLYKRWNF1Q 
NGAI MNKGTGRCLE VENRGLAG IDLILRSCTGQR WTI KNS I K * R 
EGAGAI^PGPQDMAAPPNlWTSCPGGETARGRQVIiDGPPRASFG 
QHRDPG 


5797 


2 


09: 


PRVROKTLVDVTLENSNIKDOIRNLQQTYEASMDKLREKQRQLE 
VAOVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLOEEQR 
KHSAEKEALLEETNSFLKAIEEANKKMQAAEISLEEKDQRIGEL 
DRLIERMEKERHQLOLQLLEHETEMSGELTDSDKERYQQLEEAS 
AS LRER I RP LNDMVH CQQ KKVKQM VEEI ES LKKKLQQKQLL I IjQ 
LLEK I S FLEGENNELQSRLDYLTETQAKTEVETREIGVGCDI.LP 
SQTGRTRE I VMPSRK YTPYTRVLELTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNQEKOPYYEEQARLSKIHLEKYPNYKYKPRPKR 
TCIVDGKKLRIGEYKOLMRSRRQEMRQFFTVGQOPQIP3TTGTG 
WYPGAI TMATTTPS PQMTS DCS STSAfiPEPS LPVIQSTYGMKT 
DGGSLAGNEM1NGEDEMEMYDDYEDDPXSDYSSENEAPEAVSAN 


5799 


2679 


1435 


LLSTY I KFINLFPETKAT IOGVLRAGSQLRNADVELQQRAVE YL 
TLSSVASTOVTiATVLEEMPPFPERESSILAKIiKRKKGPGAGSAli 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PAS AGAGNLLVDVFDGPAAQPSLGPTPEEAFLS PG PEDIGPP I P 
EADELLNKFVCKNNGVLFENOLLQIGVKSEFRQNLGRMYLFYGN 
KTSVOFONFSPTVVHPGDL0TQLAVQTKRVAAQVDGGAQVOOVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFOPTEM 
AAODFFQRWKQLSLPpQEAOKIFKANHPMDAEVTKAKLliGFGSA 
LLDNVD PN P ENF VGAG 1 1QTKALQVGCLLRLEPNA0A0MVRLTL 
RTSXEPVSRHLCEL1AQQF 


5800 


2679


1435


LLSTyiKFINLFPETKATIOGVLRAGSQLRNADVELQQRAVEYL 
TIiSSVASTDVLAT\ r LEEMPPFPERES£IIiAIOKRKKGPGAGSAli 
DDGRRDPSSNDINGGMSPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 
EADEll-NKFVCKNNGVLFENOLLQIGVKSBFRQNLGRMYLFYGN 
KTSVQFONFSprVVHPGDLQTQLAVOTKRVAAQVDGGAQVQOVlj 
NIECLRDFLTPPLLSVRFRYGGAPOALTLKLPVTINKFFQPTEW 
AAQDFFOH WKQLSLPQOEAQK I FKANH PMDAEVTXAXLLGFGSA 
LLDNVDPNPENFVGAG I IOTKALQVGCLLRLEPNAQAQMYRLTL 
RTSXEPVSRHLCELLAQQF 


5801


3 


1413 


FPRLYHLIPDGEITSIXINRVDPSESLSIRLVGGSETPL^/HIII 
QHI YRDGVI ARDGRLLPGDI 1 LXVNGMDI SNVPHNYAVRLLRQP 
CQVLWLTWREOXFR SRNNGQAPDAYR PRDDSFHVI LN KS S PEE 
QLG t KLVR KVDEPG VF I FNVLDGG VAYRHGQLEENDRVl»AI^3GH 
DLRYGS F E S AAHL 1 QAS E RR VHL WS RQVRQR S P DI FQE AGWN S 
NGSWSPGPGERSNTPKPLHPTITCHEXWNIQKDPGESLGMTVA 
GGASHREWDLPI YVIS VEPGG VI SRDGRI KTGDI L1*NVDG VELT 
EVSRSEAVALLKRTSSSIVLXALEVKEYEPQEDCSSPAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNCNKPFFI KS I VEGTPAYNDGR 1RCGDILLAVNGRSTSG 
M IHAC LAR LLXELKGR I T LT I VS W PGT FL 


5802 


3 


290 


CFS LY02 MERI MDLPTLLRHAFREMFSVGGLFWMFR I Rl 1 LCLM 
GAFFYL1SPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 
ITQRLTF 


5803 


2234 


1299 


E AQFGTT A E I Y AYR EE QD FG I K I V XVKAI GRQR FKVLELRTQSD 
G1Q0AKV0ILP2CVLPSTMSAVQLESLNXCQIFPSKPVSREDQC 
SYXWW0KY0XRXFHCANLTSWPRWLYSLYDAETLMDRIXKOLRE 
KDENLKDDSLPSNFIDFSYRVAACLPIDDVLRIOIJuKIGSAIQR 
LRCELD1MNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGrVHETLTVYKACNLNLIGKPSTEHSWFPGYAWTVAOCXICA 
SHIGWKFTATKXDXS PQKFWGLTRSALLPT1 PDTEDEI S PDKVI 
LCL ' 


5804 


2 


1707 


EMEXQROEEORXRTEEERKRR1E0DMLEXRKI0RSLAXRAEQIE 
DINNTGTESASEEGDDSLLITWPVKSYKTSGRMXKNFEDLEKE 
REEKER1 KYEEDKR I RYEEQRPSLXEAKCLSLVMDDEI ESEAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEXRAFEE 
ARRQMVNEDEENQDTAXIFKGYRPGKLKLSFEEMERQRREDEKR 
XAEEEARRR I EEEKKAFAEARRNMWDDDSPEKY XTISOEFLTP 
GKLEINFEELLXQKKEEEKRRTEEERXHKLEMEKQEFEOLRQEM 
GEEEEENETFGLSREYEELIKLXRSGSIQAKNLKSKFEKIGQLS 
EKEIQXKIEEERARRRAI DLEI KEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMXARFEOMAXAREEEEQRRIEEQXLLRMQPEQREI 
DAALQXKREEEE2EEGS I MNGSTAEDEEQTRSGAPWFXXPLXNT 
SWDSEPVRFTVXVTGEPXPEITWWFEGEILQDGEDYQY2ERGE 
TYCLYLPETFPEDGGEYMCKAVNNXGSAASTCILTIESKN 


5805 


3 


776 


Y I SDTLGOVY XSK 1 RWWI EENGGNGNISVDDLIALLDLAEHASS 
AFKESQQOS EDREYE VXERLYPXS XRR YDTYNIAG YQGE 1 EVGL 
YT1QIL0L1PFFDNXNELSXRYMVNFVSGSSDIPGDPNNEYKLA 
LKNYIPYLTKLXFSLXXSFDFFDEYFVLLKPRNNIXQNEEAXTR 
RKVAGY FKKYVDI FCLLEESQNNTGLGSKFSBPLQVERCRRNLV 
ALXADKFSGLLEYLI XSQEDAI STMXCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRT AN LY S LH SW LG I TT VFLFACQR FLG FAVF LLP W 
ASMWLR S LLXP I HVFFGAAI LSLS I AS VI SG INE XLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLASSWKRF 


5807 


2267 


1302 


RFS XXTFRR PMAVD I QP ACLGLY CG KT LLFXNGS TE I YGE CG VC 
PRGQRTN AQKY CQ PCTES PE LYDWLYLGFMAMLP LVLHWF F I EW 
YSG XXS S S ALFQH I T ALFECS MAAI I TLLVS DP VG VLY IRS CRV 
LMLSDW YTMLYNPS PDYVTTVHCTHEAVYPLYTI VF I YYAFCLV 
LHMLLR PLLVKK 3 ACG LGKSDR FKS I YAALY FFP1 LTV LQ AVGG 
GLLYYAFPY 1 1 LVLS LVTLAVYMS AS E 1 EKC Y DLL VR KKRLIVL 
FSHWLLHAYGI 3 SISRVDKLEQDLPLJLALVPTPALFYLFTAXFT 
EPS R I LSEGANGK 


5808 




433 


SLPDSGWEYLSNGGVADNHKDFGELRYKECLMNFSCNGKNGSS 
EGR I THGFOLKSAYENNLMP YTNYTFDFKG V IDY I FYS KTHMNV 
LGVLGPLDPQWLVENWITGCPHPHIPSDHFSLLTQLELKPPLLP 
LVNGVKLPNRR 


5809 




2422 


ILVPGFQG I LHPGVY CALQSQKQAQELV7vDl D2CEVSGLCRKGG 
RCVNTHGS FECYCMDGYLPRNGPEPFHPTTDATSCTE3 DCGTPP 
EVPDGYIIGNyTSSLGSQVRYACREGFFSVPEDTVSSCTGLGTW 
ESPKLHC0EINCGNPPEMRHA3LVGNHSSRLGGVARYVC0EGFE 
SPGGKITSVCTEKGTWRESTLTCTEILTKIWDVSLFNDTCVRWQ 
INSRRINPKISYVISIKGQRLDPMESVREETWLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFOTAEVDLLEDDGSFNI 
SIFNETCLKLNPJ?SRKVGSEKMYQFTVLGOHWYLANFSHATSFN 
FTTREOVPWCLDLY PTTDYTVNVTLLRS PKRHS VQI TI ATPPA 
VKQT3 SNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYOKEF 
AQEMTFN3SSSSRDPEVCLDLRPGTNYNVSLRALSSELPVVISL 
TTOITEPPLPEVEFFTVHRGPLPRLRLRXAKEKNGPISSyOVIiV 
LPLALOST FS CDSEGASS FFSNASDADGYVAAELLAKDVPDDAM 
EI P3 GDRLY YGEYYNAPLKRGSDYCI ILR 3 TS EWNKVRRHSCAV 
WAQVKDS S LMLLQMAGVGLGSLAWI II/TFLSFS AV 


5810 


2 


1641 


KV FGTHKDH E VSTLDT AI S AV KVQLAEFLEN LQE KS LR I E AFVS 
B I E S FFNT I EENCS KNE KRLEEQNEEMMK KVLAQYDE KAOS FEE 
VKKKKME FLHEQMVHFLQSMDTAKDTLET 3 VREAEELDEAVFLT 
SFEE3 NERLLSAMESTASLEKMPAAFSLFEKYDDSSARSDOMLK
QVAVP0PPRLEPOEFNSATSTTIAVYWSMNKEDVIDSFQVYCME 
EPQDDOEVNELVEEYRLTVKESYCIFEDLEPDRCYQVWVKiAVNF 
TGCSLPSERAI FRTAPSTPVI RAEDCTVCWKTAT IRWRPTTPEA 
TETYTLEYCROHSPEGEGLRSFSGIKGLOLKWI^QPNDNYPFYV 
RAIMA FGTS EQSEAAL ISTRGTRPLLLRETAHPALHISSSGTVI 
SFG ERR RLTE I P S VLG EELP S CGQH YWETT VTD C P AYRLG I CSS 
SAVOAGALGOGETSWYMHCSEPORYTFFYSG 3 VS DVHVTERPAR 
VGI.LLDYNNORLIFINAESEQLLFIIRHRFKEGVHPAFALEKPG 
KCTLHLG IB P PDS VRKK 


5811


1916 


851 


AAALADPLP EDKVJSAEKRR PLKS SLGYEITFS LLNPDP KSHDVY 
WDI EGA\mRYVQPFLNALGAAGNFSVDSQ3 LYYAMLGWPRFDS 
ASSSYYUDMHSLPHVINPVESRLGSSAASLYPX'L.NFLLYVPELA 
HSPLY I QD KDG AP VATNAFH5 PR WQG I MVYNVDS KTYNAS VLP V 
RVE\W^VWEVFLAQLRLLFGIAQPQLPFKCLLSGPTSEGLMT 
WELDRLLWARS VENLATATTTLTSLAQLLGKI SN I VI KDDVASE 
VYKAVAAVOKSAEF.LASGHLASAFVASQEAVTSSELAFFDPSLL 
HLLYFPDDOKFAI Y I PLFLPMAVPI LLSLVK1 FLETRKS WR KPE 
KTD 


5812 


5204


2744 


GGRQRCORGRSCGAREBEVEPGTARPPPAASAMDASLEK3ADPT 
LAEMGKNLKEAVKMLEDSQRRTEEENGKKLI SGDI PGPLOGSGQ 
MVS I bQhVQN LMHGDEDEE PQS PR IQNIGECGHMALLGKS LGA 
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHEEERE 
GLAK3 CRLA3HSRYEDFWDGFNVLYNKKPVIYLSAAARPGLGQ 
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDD3ERG 
RLPLLLVANAGTAAVGHTDKIGRLKELCEQYG I WLHVEGVNLAT 
UOjGYVSSSVIAAAKCDSMTMTPGPWI^LPAVPAVTLYKKDDPA 
LTLVAGLTSN KPTDKLRALPLWLSLQYLGLDGFVER I KHACQLS 
ORLOES LKKVNY I K I LVEDELS S PWVFR FF0EL PGSDP VFKAV 
PVPKMTPSGVGRERHSCDALNRWLGEQLKQLVFASGLTVKDLEA 
EGTCLRFSPLMTAAVLGTRGEDVD0LVAC3 ESKLPVLCCTLQLR 
EEFKQE V E ATAGLLY VDDPNWSGIGWRYEKANDDKSSLKS Y PQ 
GENIHAGLLKKLKELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVETI AATAREI EDNSRLLENKTEWRKGI OEAQVELQKAS 
EERLLEEGVLROIPWGSVLNWFSPVQALOKGRTFNLTAGSLES 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








TE ? I YV YKAQGAGVTLP PTPSGS RTKQRLPGQKPFKRS LRGS DA 
LSETSSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDOTEAFQKGVPHPEDDHSOVEGPESLR 


5813


2936 


699 


H R DG VSGS LER PLTDRS RTGA F AQQRG KMATAGGG SGAD PG S RG 
LLRLLS FCVLtiAGLCRGNS VERKI Y I PLN KTAPCVRLLNATKQI 
GCQSSISGDTGVrHWEKEEDLQWVLTDGPNPPYMVLLESKHFr 
RDLMEKLKGRTSR3 AGLAVSLTK PS P ASG FS PSVQCPNDGFGVY 
SNS YGPEFAHCRE I QWNSLGNGLAYEDFSFPI FLLEDENETKVI 
KOCYODHNLSQNGSAPTFPLCANQLFSHMAWLSFSTAT\CMRRS 
S 1 OS TFS INP K I VCDPLSDYNWISMLKP 1 NTTGTLKPDDRWVA 
ATRLDSRS FFIWV\ APGAES AVAS FVTQLAAAEALQKAPDVTTL 
PRNVMFVFFQGETFDYIGSSRMVYDHEKGKFPVOLENVDSFVEL 
GQVALRTSLELWMHTDPVSQKNESVRNQVEDIiLATLEKSGAGVP 
AV1LRRPNQSQPLPPSSLORFLRARNISGWLADHSGAFHNKYY 
QS I Y DTAENINVS Y PEWLEP LKE / ETWNFG* QDTAKALADVATV 
LGRALYELAGGTNFSDTVQADFQTVTRLLYG\FlilKANNSWFQS 
IL0GRDLRSYLG*RGLFQH\YIAV\SSPTNT1YV/VLQYALANL 
TGTWNLTREQCODPSKVPSENKDLVEYSWVOGPLHSNETDRIiP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 
IASKELELITLTVGFGILIFSL1VTYCINAKADVLFIAPRSPGA 
VSV 


5814 


8500 


432 


ALK CR PRR VLAI LVGP VQPDRMAE EGA VAVCVR VRPLNSREESL 
GETAQVYWKTHNNVIYPVDGSKSFNFDRVLHGNETPKNVYEA\I 
AAP3 I DSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDKLGVIPQ 
GOFHGHFSQKI + EVFLDREFLI.RVSYMEIYKBTITDLLCGTQKM 
KPL3 1 RSDVNRNVYVADLTE E WYTSEMALKWI TKGEKSRHYGE 
TKKN0RSSRSHT1FRMILESRBKGEPSNCEGSVKVSHLNLVDUA 
GSERAAQTGAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI 
NYRDS KLTR I LQNSL3GNPKTRI 1 CT3 TPVSFDETLTALQFAST 
AKY MKNTP YVNEVS TDEALL KRYR KE I MDLKKQLEE VS LETRAQ 
AMEKDQLAQLLEEKDLLQKVQNEK1ENLTRMLVTSSSLTLQQ2L 
KAKR KRR VTWCLGK INKMXNSNYADQFNI PTN1 TTKTHKLS INL 
LRED DESVCSESDVFSNTLDTLSE1 EWNPATKLLNQENIESELN 
SLRADYDNLVLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 
KDOEMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKED 
QIKKLQEYIDSQKLENIKWDLSYSLESIEDPKOMKQTLFDAETV 
ALEAKRESAFLRSENLELKEKMKE1ATTYKQMENDIOLYQSQLE 
AKK KMQ VDLEKELQS AFNE I TKLTS L 1 DG X VPKDLLCNLELEGK 
I TD1 OKE LNKEVEENEALREE VI LLS ELKS LPSEVERLRKEI QD 
KSEELHI I TSFJCDKLFSEVVKKESRVOGLLEEIGKTKDDLATTQ 
SNY K STDQSFQN FKTLHMDFE QKY KMVLEENERMNQE I VNLS KE 
AQKRDSSLG ALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 
NRDSPLQTVEREKTLITEKLQQTLEEVKTLTOEKDDIjKQLOESL 
QIERDQLKSDIHDTVNMNIDT0E0LRNALESLKQH0ETir3TLKS 
KI SEE VSRWLHMEENTGETXDEFOOKMVGI DKXQDLEAKNTQTL 
TADVKDNEI IEQQRKI FSLIOEKNEIiQQMLESVI AEKEQLKTDL 
KEN 1 EMT I ENQEELR LLGDEL KKQOE I VAOEKNHAI KKEGELSR 
TCDRIAEVEEKLKEKSOQLOEKQQQLLNVQEEMSEMQKKINEIE 
NLKNELKNKELTLEHMETERLEIAQKLNENYEEVKSITKERKVL 
KELOKSFETERDHLRGYIREIEATGLQTKEELKIAHIHLKEHQE 
TIDELRRSVSEKTAQIINTQDLEKSHTKLQEEIPVIiHEEOELLP 
I^XK\'SETQETMNELELLTEOSTTKDSTTLARIEMERLRLNEKF 
OESOEEIKSLTK^DNLKTIKEALEVKKDOLKEHIRETLAKIQE 
SQ S XOEOS LN MKEKDM ETTK I VS EMEQFKP KDS ALLR I E 1 EMLG 
LSKkLQFSHDEMKSVAKEKDDLQRLOEVLOSESDQLKENIKEIV 
AIOILETEEELKVAHCCLKEQEETIKELRVNliSEKETEISTIQKQ 
LEA I NTDKLQNKIQEI YEKEEQLN I KQ ISEVQEXVNELKQFKEHR 
KAKDSALQSIESKMLELTNRLOESQEEIQIMIKEKEEMKRVOEA 
LQIE RDOLKENTKEI VAKMKESQEKEYQFLKMTAVNETQEKMCE 
I EHLKEOFETQKLNLENI ETENI RLTQ1LHENLEEWRSVTKERD 
DLRSVEETLKVERDOLKENLRETITRDLEKOEELKIVHMHLKEH 
CETIDKLRGIVSEKTNEISKHOKDLEHSNDALKAQDLKIOEELR 
IAHMHLKEQQETIDKLRGI VSEKTDKLSNMQKDLENSNAKLQEK 
IQELKAKEHQLI TLKKDVN ETOXKVSEMEQLKKQ I KDOS LTLS K 
LEIENLNLAQKLHENLHEMKSVMKERDNLRRVEETLKLERDQLK 
ESLQETKARDLEIQ0ELKTARMLSKEKKETVDKLREKISEKT1Q 
ISDIQKDLDXSKDELQKKIOELQKKELQLLRVKEDVNMSHKKIN 
EMEQbKKQFEPNY LCKCEMDN FQbTKKbHESbEE I R I VAXERDE 
LRRIKESLKMERDOFIATbREMIARDRONHQVKPEKRLLSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
R I MKKLK Y VLS Y VTK I KEEOHE C I NKFEMDFI DEVE KQKELL I K 
IQHLQ0DCDVPSRELRDLKLN0NMDLH1EEILKDFSESEFPSIK 
TE FQQVLS N RKEMTQFLEEWLNTRFD I EKLKNGIQKENDR I CQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KN YQTLKTS IoASGAQVNPTTCDN iCN PHVTSRATQLTTE KI RELE 
NSbHEAKESAMHKESKI 1 KMOKEbEVTNDI IAKLQAKVHESNKC 
LEKTKETI QVL0DKVA1.GAKP Y KEE 1 EDLKMKLGKI DbEKMKNA 
KEFEKE3SATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 
DPQPSNKPLTCGGGSGIVQNTKALILKSEHIRLEKEISKLKQQN 
EOLIKQKNELLSNNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPSPHPVRYFDKSSLGLCPEVONAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ 


5815


23 


146C 


SEbVMWTVONRESbGLLSFPVMITMVCCAKSTNEPSKMSYVKET 
VDRLLKGYDI RLRPDFGGP PVDVGMR I DVAS IDMVSEVNMDYTL 
TMYFOOSWKDKRbSYSGIPLNLTLDNRVADQLWVPDTYFLNDKK 
SFVHGVTVKIJRMIRLHPDGTVLYGLRITTTAACMMDLRRYPLDE 
CNCTLEIESYGYTTDDIEFY'.'JNGGEGAVTGVNKIEbPQFSIVDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
S WVS FW I NYDAS AAR VALG I TTVLTMTTI S THLRETLPKI P Y V K 
AI DI YLMGCFVFVFLALLE YAFVNY I F FGKGPOKKGAS KQDQSA 
NE KN KLEMN KVQ VDAHGN ILLSTLEIRNETSGS E VLTS VS D P KA 
TMYS YDSAS IQYRKPLSSRE\A*GRAPDRKGVPSKGRI RRRAS\ 
QLKVKIPDLTDVNS1DKWSRMFFPITFSLFNWYWLYYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLOQQLFSQEAADEICKRLAPDSRL
NPHRSLI/5TGNYDVNVIMAAbOGU3LiAAWWDRRRPLSObAL?Q
VLGLILNLPSPVSLGLLSLPLRRRKLRWPCARL/VTVSYYNLDS
K\ LRAP EGPGGLRTE\ * G P FbAAAbAOGb CEVLL WTKE VE EKG 
SWLRTD 


5817 


651 


116 


RLFRGPGAMRGRSCRGCSGGREPSGGALPKRHCPC*PPSFPAAD 
VMSNTTV PNA PQANSDSM VGY VLGPFFLI TLVG VWAWM YVQK 
X KRVDRLRHH LLPMYS YDPAE E LHE AEQE LLSDMGDP KW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLLPbLSP 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHbCr L 


5818


3 


3916 


OAbRDKb WI FLVOS F YA VRHTES WKLMSTDDQQKIQAAAFDKGD 
DRRLGKKP I FSSSQQRKQVSDSGDIKI KS WRGNNKKECWSYLST 
N KKMKS DG bG AS GHS S STNRN SIN KTLKQDD VKEKDGTKI AS K I 
TKEbKTGGKNVSGKPKTVTKSKTENGDKARbENMSPRQWERSA 
TAAAAATGQ KNLLNGKG VRNQEG QI SGAR P KVbTGNbNVQAKAK 
PbKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKbSFCDSPGOMMKJCSVDSVKNSTVAIKSRPVSRVT 
NGTSNKKS I HEQDTNVNNSVbKKVSGKGCSEPVPQAI LXKRGTS 
NGCTAAQORTKSTPSNbTKTQGSQGESPNSVKSSVSSRQSDENV 
AKbDHNTTTEKQAP KRKMVKQ VH TAbPKVNAKI VAM P KNbNQS K 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSKFH 
DVRDNNNKDSVSE0KPHKPL3NbASEISDAEALQSSCRP\DPQK 
PbNDQEKEKbAbECQNI SKbDKSbKHEbES KQI CbDKSETKFPN 
KKETDDCDAANICCHSVGSDmT^SKFYSTTAbKYMVSNPNENSL 
NSNPVCDbDSrSAGQIHblSDRENQVGRKDTNXQSSIKCVEDVS 
bCNPERTNGTbNSAQEDKKSKVPVEGbTI PS KbSDES AMDEDKH 
ATADSDVSSKCFSGQLSEKKSPKNMETSESPESHETPETPFVGK 
WNLSTGVLHQRESPESDTGSATTSGDE3 KPRSEDYDAGGSQDDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSN1.RIEVK 
MKKQSSNDLFQVKSTSDDE1PRKRPEIKSRSAIVHSRERENIPR 
GS VQFAQE I DQVSS S ADETED ER S EAEN^AENFS I SNPAPQQFQ 
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDG1SHEHHGRTCYSRFSRESEDNILE 
CKQNKGNS VCKNESTVLDLS SIDSS RKNKQS VSATEKKNTI DVL 
SSRSRQLLREDKKVNNGSNVENDIQOREXFLDSDVKSOBRPCHL 
DLHQaEPNSDIPKNSSTKSLDSFRSQVL-PQEGPVKESHSTTTEK 
ANI ALS AGD I DDCDTLAQTRM YDHR ? S KTLS P I YEMDVI EAFEQ 
KVESETHVTDMDF* DDQHFAKQDWTLiLKQLLSEQDSNLDVTNSV 
P EDLS LAQ YL INOTLLLARDS S KPOG 1 TH I DTLNRWS ELTSPLD 
SShS I TMAS FS S EDCS PQGE WTI LELETQH 


5819 


1 


5557 


AAAGLLGALHLVMTLWAAARAEKEAFVQS ES 1 1 EVLRFDDGGL 
LQTE TTLGLS S YOQK S I S L YRGN CR ? I R F E P PMLDFHEQ P VGM P 
KME KVYLHN P S S E * T I TLVS I FATTS H FHAS F FQNRK I L PGGNT 

SFDVS/VFLARWGMVENTLFINTSNHGVFTY\QVFGVGVPNPY

RLRPFLGARVTVNSSFSPIINIHNFHSEPLQVVEMYSSGGDLHL 
ELPTGQOGGTRKLK'EIPPYETKGVMRASFSSREADNHTAFIRIK 
TNASDSTEFI1LPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 
LNLHLLNSGTKDVP I TS VRPTPQ\NDAI TVHFKP I TLKAS \ESK 
YTKVAS IS FDAS KA K KPSQFSGKI TVKAKEKS YS KLE IP YQAEV 
LD G YLGFDHAATL FH I RDS PADPVER P I YLTNTFS FAILIHDVL 
LPEEAKTMFKVHiaFS KPVLI LPNESGY I FTLLFMPSTSSMHIDN 
N I LL I TN AS KFHL P VRVYTGFLD Y FVL PPKIEERFID FGVLSAT 
EASNILFAIINSNPIEIiAIKSWHIIGDG\LS3ELVAVDRGNRTT 
I3SSLPECEKSSSSDQSSVTLASGYF\AVFRVKLTAKKL\EGIH 
DGAIQITTDYEIL7IPVK\AVIAVGSLTCSPKHVVLPPSFPGKI 
VHQSLNIMNSFSQKVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 
KKSKIANIYFDPGLOCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 
WDADWDLHQSLFKG WTG 1 KENSGHRLS A I FEVNTDLQKN I I SKI 
TAELSWPS ILSS PR HLKFPLTNTNCSS \ EEE ITLENP/ SQDVPV 
YVQFI PLALYSNPS VFVDKLVS RFNLSKVAKIDLRTLEFQVFRN 
SAHPLQSSTGFMEG \ LS PHLILNLI LKPGEKKS VKVK\FTP VHN 
RTVSSLIIVRNNLT\/MDAVMVOGCGTTENLRVAGKLPGPGSSLR 
FKITEAIiLKDCTDSLKLREPNFTLKRTFKVFjgTGQLQIHIETIE 
ISGYSCEGYGFKVVT\3COEFTLSANASRDI IILFTPDFTASRVIR 
ELKFI TTSGSEFVFI LNASLFYHMLATCAEALPRPNWELALYI I 
I SG I MS ALFLLVI GTA\YLEAOGI WE P\ FRRRLS \ FEASNPP FD 
VGRPFDLRRIVG2 £S EGNLNTLSCDPGHSRGFCGAGGSSSRPSA 
GSHKQ* GPSGHPHS SHSNRNS ADVDDVRAYNSGRTSSMTSAQAA 
SSQPANKTRPLVLDSNTGAOGHSAGRKSKGAKQSQHGSQHHAKS 
PLEOHPQPPLPPPVPOPQEPOPERLSPAPLAHPSHPERASSARH 
SSEDSDITSLIEAMDXDFDHHDSPALEVFTEQPPSPLPKSKGKG 
KPLCRKVKPPKKQEEXEKKGKGKPQEDELKDSLADDDSSSTTTE 
TSNPDTEPLLKEDTEKOKGKO/^PEKKESEMSQVKQKSKKLLNI 
KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK
SRNAQKTKGTSKLVDNRPPALAKFLPKSOELGNTSSSEGEKDSP 
P P EWDS VP VHKPG S S TDSL YKL S LQTLNAD I FLKQRQTS PTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSXHKLTKAA 
SIjPGKNGNPTFAAVTAGYDKSPGGNGFAKVSSNKTGFSSSLGIS 
HAPVDSDG S DSSGLW S PVSNPS S PDFTPLNS FSAFGNS FNLTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
SGSPTHTATS VLGNTSGLWSTTPFS SS I W S SNLSSALP FTTPAN 
TLAS IGLMGTENS PA PHAPSTS S PADDLGQTYNPWR 1 WSPTIGR 
RSSDPWSNSHFPHEN 


5820 


310 


1270 


RVSLSGPVSLGVLLCARSSTMGKRDNRVAYMNPIAMARSRGPIQ 
SSGPTIQ\ VI * IDOGLPGKK* KSN * KRKR K / DS KALAE FEEKMN 
E]WKKELEKHREKL!.SGSESSSKKRQRKKKEKKKSW+\DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKKRSHKSSESSMSETES 
DSKDSLKKKKKSKDGTEKEKDIKGLSXKRKMYSEDKPLSSESLS 
ESEYIEEVRAKKKKSSEEREKATEKTKKKKKHKKHSKKKKKKAA 
SSSPDSP*K*EKSGFPYKESAMSEEI STV KTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821


179 


915 


KWRNQSVJRWPKPG'rNWMI.»SCSVCWRRVrvgTGSVWMRKLGKHPOT
PT / 1 KD CS I AATG KR P S AR F PHQRRK KRP. EMDDGLAEGG PQRS N
TYVI KLFDRSVDLACFSENTPLYPI CRAWMRNS P5VRERECSP5
SPLPPLPEDEEG\5EVTNSKSR*CVQACFPTHTPGGQPKNACR\
SRIPSPLAALRMC^TP*RWSPFEPEPSPSTLIYRNMQRWKRIRQ
RWKEASHRNQLRYSESMKILREMYERO 


5822 


464 


4379 


OTLKEM PI VMARDLiEETASSSEDEEVI SQEDHPCI MWTGGCRRI 
PVLVFHADAILTKDNN1RVIGERYHLSYKIVRTDSRLVRSILTA 
HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTRKDRLYKNI IRMQHTHGFKAFHILPQTFLLPAEYAEFCNSYS 
KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 
LLIDDFKFDVRLYVLVTSYDPLVIYLYEEGLARFATVRYDOGAK 
NIRNQFMHLTNYSVNKKSGDYVSCDDPEVEDYGNXWSMSAMLRY 
LKQEGRDTTALMAK VEDLI I XTI I SAEIiAI ATACKTFV PHR S SC 
FELYGFDVLIDSTLKPWLLEVNLSPSLACDAPLDLKIKASMISD 
MFTWG FVCQDPAQRASTRP 1 YPTFES S RRNPFOKPQRCRPLS A 
SDAEM KNL VG S AR E KG PG KLGGS VLG LSMEE I KVLRR V KE ENDR 
RGGFI R I FPTS ETWE I YGS YLEHKTS MNYMLATRLFQDRMTAEK5 
APELKI ♦ SLKSKAKLHAALYERKLLSLEVRKRRRRSSRIiRAMRP 
XYPVITQPAEMNVKTETESEEEEEVALDNEDEEOEASQEESAGF 
LRENQ AKY TPSLTALVENTP KENSMK VRE WNN KGGHCCKLE TOE 
LEPKFNLMQI LQDNGNLS KMQAR I AFS A Y LQHVQI \RLMKDSGG 
QTFSASWAAKEDEOMELWRFLKRASNNLQHSLRMVLPSRRLAL 
LERTRILAHQLGDFIIVYNKETEQKAEKKSKKKVEEEEEDGVNM 
ENFQEFIROASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 
NYSDSGAKGDHPET I MEEVKIKPPKQQQTTEIKSDKLSRFTTSA 
E KEAKLVY S N£ S S G PTATLQKT PNTHLS S VTT S DLS PGP CHH S S 
LSQI PS AI PSMPHQPTILLNTVSASAS PCLHPGAONI PSPTGLP 
RCRSGSHT I GP FS S FQSAAH I YSQKLS R PS SAKAGS CYLNKHHS 
G I AKTQKEG EDAS L Y S KR YNQS M VTAELQRLAEKOAARO YS P S S 
H INLliTQQVTNLNLATGI I NRSSASA P PTLRPI ISP SGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLVJTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPPPNHEQVLRRATSQKASKGSSAEGOLNGLQSSLNPAAFVP 
ITS STDPAHTKI MNHKHTEKQPVHHS WVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKED1LLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWS PLAGE KFVEVYKEAHLLALH I ESSSRNQAAQAAXP 
EDPRSCX5 VERFIOE S K?\ Kl NLFEKEKEMKKS PTS LKRETY YLS 
DS PLLGPPVGEPRLLAS S PALPS SGAQARLTRAPG PPHSAHALP 
RESCTAHAASOAATORKPGTKLLLPRAASVRGRGI PGAAEKPXK 
E I PAS PS R TKI PAE K ESH RDVLP D KP APGA WVPAAGSHLGQG K 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQP VAKAKS SEFAS I PAN* LPGLCPNI S KS \GRMGPAMLRPA 

L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE

GGG\QWLNSSCAWSESSOLNKTRSIRRRDSCLNSKTKVr4PTPTN 
OFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEFRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GS PPSRVPQALNFS P EES DSTFSKS TATE VARE EAKPGGDAAPS 
EALLVDIKLEPLAVT PDAASQPLI DLPL I DFCDTPEAHVAVGSE 
SRPLI DLMTNTPDMNKNVAKPSPWGQLI DLSS PL I Ql<$ PSADK 
ENVDSPLLKF


5824 


42 


2293 


LLTALSMEGGGGRDE P S ACRAGDVNMDDPKKED I LLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TS ESPFAWS PLAGEK FVEVYKEAHLLALH 1 ESSSRNQAAQAAKP 
EDPRSQG VER FI QES K F \ K 3 N LF EKE KEM KKS PTS LKRETY YLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHAL? 
RESCTAHAASQAATQR K PGTKLLLPRAAS VRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVF\NKLGL.KKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAW* LPGLCPNISKS\GRMGPAMLRPA 
L\PAGFVG\ASSWOAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG\OWLNSSCAWSESSQLNKTRSlRRRDSCLNSKTKVMFTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGIiPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKJ'JSAMRTEPTRESNRKTDSR\1jVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDJKLEPLAVTPDAASQPLIDLPLIDFCDTPEAHVAVGSE 
SR PLI DLMTNTPDMNKH VAKPS PWGQLI DLS SPL3QLS PEADK 
ENVDSPLLKF 


5825 


2 


4210 


FLQIESASPAPFSSGFLAAHPHSPGGSLATKGRSRLSAPGMLHL 
SAAP PAF P PEVTATARPCLC SVGRRGEGGXMAAAGALE RSFVEL 
SGAER EU PRH FREFTVCS I GTANAVAGAVK YS ESAGGFY YVESG 
KLFSVTRNRFIHWKTSGDTLELMEESLDINLLNNAIRLKFQNCS 
VLPGGVYVSETQNRVI Z LMLTNQTVHRLLLPHPSRMYRSELWD 
SQM0SIFTDIGKVDFTDPCNYQL1PAVPG3SPNSTASTAWLSSD 
GEAbrALPCASGGIFVLKLPPYDlPGMVSWELKQSSVMORLLT 
GWMPTAIRGDQSPSDRPI^IAVHCVEHDAFIFALCODHKLRMWS 
VKEOMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 
G3F\MHAPKRGQFC1FQLVSTESNRYSLDHISSLFTSQETLIDF 
ALTSTD J WALWHDAENOTWKY3NFEHNVAGQWNPVFMQPLPEE 
E 3 VI R DDQDPREM YLQS LFTPGQFTNEALCKALQI FCRG TERNL 
DLSVJSELKKEVTbAVENELOGSVTEYEFSOEEFRNLQQEFWCKF 
YACCLQYOEALSHPLALHLNPHTNMVCLLKKGYLSFLI PS SLVD 
HLYLLPYENLLTEDETT1SDDVDIARDVICLIKCLRLIEESVTV 
DMS V 3 M EMS CYNLQS PE KAA EQ I LEDM IT I DVENVMED ICS KLQ 
E1RNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAGY3VCRGVHKIASTRFLICRDLLIL00LLMRLGDAVIWG 
ItSQLFOAOQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LCHLSVLELTDSGAIJ-1ANRFVSSPQTIVELFFQEVARKHI I SHL 
FSOPKAPLSQTGLNWPEMITAITSYLLQIiLWPSNPGCLFLECLM 
GNCQYVOI^DYI01iLHPWCOVNVGSCRFMLGRCYL.VTGEGOKAL 
ECFCOAASEVGKEEFLDRliIRSEDGEIVSTPRLQYYDKVLRLLD 
V3GLPELV 1 QLATSAITEASDDW\ KS QATL S RTCI FKHHL \ DLG 
\HNSQAYGSL* PQI PDSSRQLDCLROLVWLCERSQLQDLVEFS 
YVNLHNE WGI lESRARAVPLMTHjfYYELLYAFIIl YRHNYRKAG 
TVMFEYGMRLGREVRTLRGLEKQGNCYLAALNCLRLIRPEYAWI 
VQPVSGAVYDRPGASPKRNHDGECTAAPTNRQIEILELEDLEKE 
CSLAR 1 R LTLAQHD? S AVAVAG SSSAEEMVTLLVQAGLFDTA I S 
LC0TFKLPLTPVFEGLAFKCIKU3FGGEAAQAEAWAWLAANQLS 
SVITTKESSATBEAWRLLSTYLERYKVQNNLYHHCVINKLLSHG 
VPLPNWLINSYKKVDAAELLRLYLNYDLLDLTPYQVIRICGC 


5826 


3 


871 


KS<?LLRDHSAPPPKPCTSVGAMGC*PRQ/SPKEQQRQLKKOKNR 
AAAQRS R0KHTDKADALHQQHESLEKDNLALRKE3 QSLQAELAW 
WSRTLmrHERLCPMDC^CSAPGIjJ^CWDQAEGIjIjGPGPOGOHG 
CREQLELFOTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AWAEPPVOLSPSPLLFASHTGSSbQGSSSKLSALQPSLTAQTA 
PPQPLELEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPS PHPLLAFPLLSSAQVHF 


5827 


194 


2287 


GMGSENSAJLKSYTLREPPFTLPSGLAVYPAVIXJDGKFASVFVYK 
RENEDKVKKAAKVP* * HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLE VAlrETLSSAEVCAG I YDILLALI FLHDRGH LTHNN VCJLr 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQSIRDPASIPP 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEOVSAEVLS 
SFQQTLHSTLLNP I PKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEEKTEFFKFLLDRVS CLSEELIASRLVPLLLNQLVFAEP 






WO 01/53312 



PCT/US00/34263 



VAV \ KS F L P YLLG P K KDHAOGET P CLLS PAL FQS R V I P V LLQL F 
EVHEEHVRMVLLSH J EAYVGAbSLREQLKKV\ IL\PQVLLG\LR 
D \ TS DS 1 VA I T LHS LAV LVS LLG PE VWGG ERTK 1 F KRTAP \ S F 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\OIWP\REP\CDDVKSQCTTLDV 
EESS WDDCE PS SLDTKVNPGGGI TATKPVTSGEQKP I PALLS LT 
EESMPWKSSLPQKISLVQRGDDADQIEPPXVSSQERPLKVPSEL 
GLGEEFT10VKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
EMV PKKDDVS P VMQFS S KFAAAE I TEGEAEG WEEEGELNWEDNN 
W 


5828


2 


257 


AR EGG S LG/*V AACG E LS Y SCD FCP ARPHTS WLTR F V KME FQAW 
MAVGGGSRMTDLTS S I PKPLLPVGNKFLI WY PLNLLERVGFEEV 
IWTTRDVOKALCAEFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLSCDL3 TDVALHE WDLFRAYDAS LAMLMRKGQDS I 
EPVPGOKGKKKAVEORDFIGVDSTGKRLLFMANEADLDEELVIK 
GS I LOKH PR I R FHTGLVD AHLYCL K KY 3 VD FLMENG \ S I TS 1 RS 
EL\IPYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
SFY - * KEANYTGTGAPY\D\ACWI 


5829 


260 


1255 


PDGRLIVSCSEDKTIK1WDTTNKQCVNNFSDSVGFAJNFVDFNPS 
GTCIAS AGS DQTVKVWDVRVNKLLQHYQVHSGGVNCI S FHPSGN 
YLI TASSDGTLKI LDLLKGRLI YTLQGHTG PV FT VS FSKGGELF 
ASGGADTQVLLWRTKFDELHCKGLTKRNLKRLHFDSPPHLLDIY 
PRTPHPKEEKVETVEDFFLHLLRLIQSIjR*SICRSLLPLLWISF 
LL3LPQ^0KPWGLCOTRVKRPVDIS*TLP*CH0NVCQOPRKRK 
QKT * VTS P V KVK / VS I PLAVTDALEH IMEQLNVLTQT VS I LEQR 
LTLTEDKLKDCLENOOKLFSAVQOKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEXLLQFODLTGIESMDOCRKTLEOHNW
NIEAAV0DKLNEQEGVPSVFNPPPSRPLQVN7ADHR1YSYWSR
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV
TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSOALNDAKRELRFL
LVYLHGDDHODSDEFCRNTLCAPEVISLINTRMLFWACS7NKPE
GYRVSOALRENTYPFLAMIMLKDRRE*PV\VGRLEGLI\QPDDL
INOLTFIMDANOTYLVSERLEREERNQTQVLRQQODEAYLASLR
AD0EKERKKREERERKRRKKEEVO0QKLAEERRRONLOEEKERK
LECLPFEPSPDDPESVKIIFKLPNDSRVERRFHFSOSLTVIHDF

LFSLKESP\EKFQIEA\KF?RR\VLPCIPSEE\W?NPPTLQE\A 
GLS HTE VLF VQ DLTDE 


5831 


11 


2897 


FCSKDKCCLYLPDSINRSKSCrAKPGAHSODRKAVMDSERQVKD 
TDDIESPKRSIRDSGYIDCWDSERSDSLSPPRHGRDDSFDSLDS 
FGS RSROTPS PD WLRGS SDGRGSDSESDLPHR KL PDVKKDDMS 
ARRTSKGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSFAGLGKKAL0DYGPRT\PVS\DDAESTSMFDMRC3E 
EAAVOPHSRAR0EQLOLJNN0LREEDDKW0DDLARWKSRKRSVS 
ODLIKKEEERKKMEKLLAGEDGTSERRKSIKTYREIVQEKERRE 
RELKEAYKKARSOEEAEGILOQY1ERFTISEAVLERLEMPKILE 
RSHSTEPNIkSSFLNDPNPMKYLRQQSLPPPKFTATVETTIARAS 

VLDTSKSAGSGSPSKTVTPKAVPMLTPXPYSOPKNSQDVLKTFK
VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSOMFEGVARVH
GSPLELKQDNGSIEINIKKPNSVPQELAATTEKTEPNSQEDKND
GGKSRKGNIELASSEPOHFTTTVTRCSPTVAFVEFPSSPOLKND
VSEEKDQKKPENEMSGKVELVLSQKVVKPKSPEFEATLTFPFLD
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR
OEKWOOEQSRLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY

YEEEP * 1 1 \ EDPVVPFTVSSSSADOLSTSSSKTEGSGTMNKI DL 

GNCODEKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL
AHSGNFVSKGVKEDHQLDTEAGAPKCGTNPQLAQDPSQNQQTSN
PTHSSEDVKPKTLPLDKS3NHQXESPSERRKSISGKKLCSSCGL
PLIGKGAAMIIFJTL^LYFHIQCFRCG\LCKGQLGDAVSGTDVRIR
NGLLNCNPCYMRSRSAGOPTTL


5832 


2454 


829 


PGRRFRHGSCAFQKQC1MLHICQYFLQGECKFGTSCKRSHOFSN 
SENLEKLEKLGMSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 
VPOGTSERKJDS3GSVSPNTLSQEEGDQ1CLYHIRKSCSF0DKCH 
RVHFHLPYKWQb'LDKGKWEDLDNMELIEEAYCNPKIERILCSES 
ASTFHSHCLNFNAMTYGATQARRLSTASSVTKPPHF1LTTDWIW 
yWSDEPGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WyTGVR 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN+XGLPOTQIR\AP 
QDVTTMOTCNTKFFGPKSIPDYWDSSALPDPGFOKITLSSSSEE 
YQKVWNLFNRTLPFYFV0KIERVQNLALWEVY0WQKGQMOKONG 
GKAVDEROLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YG KGSYF 
AR DAA YSHH Y S XS DTQTH TM FLAR VLVGE FVR GNAS F VR P PAKE 
GWSWAFYDSCVNS VSDPS I FV1 FEKHQVYPE YVIQYTTS SXPSV 
TPSILLALGSLFSSRQ 


5833 


17C 


3289 


SILCLLSPCWQFGKPWSILSSRSRHSPCTKKGVTEGMRKHLHT 
ROGHK* VHVE I S KALWVYRDDY Fl RHS I S VS AVI VRAW1 THKYR 
GRDV^WKWEENIjLHAVAKNYTLLOTIPPFERPFKDHCVCLEWNM 
GYIWNLRANRIPOCPLENDWALLGFPYASSGENTGIVKKFPRF 
RNRELE ATRR ORMDY P V FTVSLWLY bLHY CK AN LCG I L.Y F VDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
DI S FNGGQ I WTTS I GQDLKS YHNQTI SFREDFH YNDTAG YFI I 
(^SRYVAGlEGFFGPhKYYRLRSLHPAQlFNPLLEKQLPiEQIKL 
YYERCAEVOE I VS VYASAAKHGGERQEACHLHNS YLDLQRRYGR 
PSMCRAFPWE KELKDKH PSLFQALLEMDLLTVPRNQNES VSE IG 
GK1 FE XA VKR LSSJ DGLHQI S£ I VP FLTDSS CCGYHKAS YYLAV 
FYETGLNVPRDQIiOGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELS YAYYSNI ATKTPLDQHTLQGDQAYVETI Rl »KDDE I L 
KVQTKEDGDVFMWLKHEATRGNAAAQQPJLAOMLFWGQOGVAKNP 
EAAIEWYAKGALETEDPALIYDYAIVLFKGOGVKKNRRLALELM 
KKAAS KGLHQAVNGLGWYYHKFKKNYA\KAAKYWLKA\EE \MGN 
PDASYNLGVLHLDGIFPGVPGRNQTLAGEYFHKAAOGGHMEGTL 
WCS LY Y I TGN LE T FPRD P E KAVVWAKH VAEKNGYLGHV I R KGLN 
AYLEGS WHEALL Y YVLAAETGIEVSQTNLAH I CEER PDLARR YL 
GVNCVWRYYNFSVFQIDAPSFAYLKMGDLYYYGHQNOSQDLE^S 
VQM YAQAALDGDSQG FFNLALL I E EGTI I PH H I LDFLE I DS TLH 
SNN1 S I LQELYERCWSHSNEESFS PCSLAWLYLHLRLLWGAILH 
SAL 1 Y FLGT FLLS I LIAWTVQYFQS VSASDPPPRPSQAS PDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG


5834


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPFRTPKPPRT/QECG 
SAAPGP I PGOSSS * VPLRLEQI QQ KADCPLS LEIoALK PR KAAQV 
TLEDALSNVDLLEELPLPDQQP C I E P P PS S LLYQPN FNTN FEDR 
NAFVTG I ARYI EQATVHS SMNEMLEEGQEYAVMLYTWRSCSRAI 
PQVKCN EQ PNR VE I YEKTVE VLEP EVT KLMNFMYFQRN AI ERFC 
GEVRRLCHAERRKDFVS EAYLITLGKF INMFAVLDELKNMKCS V 
KNDHS AYKRAAQFLR KMADPQS IQESQNLSKFLANHNK1 TQSLQ 
QQLB V I S GYE ELLADI VNLCVDY YENRM YLT F S EKHMLLK VMG F 
GLY LM DGS VS N I YKLDAKKR INLS KID KY FKQLQVVPLFGDMQ I 
ELARYI KTSAH YEENKSR WTCTSSGSS PQYNI CEQMI 01 REDHM 
RFISELARYSKSEVVTGSGROEAQKTDA£YRKLFDLAI<?GI>OLL 
SQWSAHVMETVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVIAMI KGLQVLMGRMESVFNHAI RKTVYAALQDFSQ 
VTLMEPLRQAI KKKKNVIQS VLQAI RKTVCDWETGHE PFNDPAL 
RGEKDPKSG* Dl KVPRRAVGPSSTQLYI4VRTMLESLI ADKSGSK 
XTLRSSLEGPT I LD I EKFHRES FF YTIIL INFSETLQQCCDLSQL 
WFREFFLELTKGRRIQFPIEMSMPWILTDHILETKEASMMEYVL 
YSLDLYNDSAHYALTRFNKQFLYDEIEAEVNLCFDOFVYKLADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 
HVQLLGRS I DLNRLI TQRVS AAMYKSLELA1 GR PESEDLTS I VE 
LDGLLEINRMTHKLLSRYLTLDGFDAMFREANHN\'SAPYGRITL 
HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSOEFORDKQPNAQP 
OYLHGSKALNLAYSS I YGS YRNFVGPPHFQVI CRLLGYOGI AW 
MEELLKWKSLLQGT I LQYVKTLMEVMPKI CRLPRHEYGSPG IL 
EFFHKOLKDIVEYAELKTVCFQNLREVGNAI LFCLLI EQSLSLE 
EVCDLliHAAPFONILPRVHVKEGERLDAKMKRLESKYAPLHLVP 



LIERLGTPOC'IAIAREGDLLTKERLCCGL>SMFEVILTRIRSFLD 
DP 1 WRGPLFS NG VMHVDE CVEFHR LKS AMQFVYCI P VSTHE FTV 
EOCFGDGLHWAGCMIIVLLGQQRRFAVLDFCyHLLKVQKKDGKD 
EI 1KNVPLKKMVERIRKFQILNDE 3 371 LDKY LKSGDGEGT P VE 
HVRCFQPPIKQSliASS 


5835 




1904 


SGN 3 RMAQGSHQ I D FQVLHDLRQK F PE V PEVWSRCMLQKNNNL 
DACCAVLS QESTRYLYGEGDLNFSDDSGI SGLRNHMTSLKLDLQ 
S3N3YHHGREGSRMNGSRTLTHS1SDGOLCX3GOSNSELFQQEPO 
TAPAQVPQGFNVFGMSS S SGASNSAPHLG FHLGSKGTSSLSQQT 
PRFNPIMVTLAPNIQTGRNTPTSLH3HGVFPPVLNSP0GKSIYT 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGFWTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
S0SSAHSQYNI0NISTGPRKNQIE3KLEPP0RNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNCPTVYIAASFPNTDELMSRSOPKVYISA 
NAATGDE0VMRNQPrLP3STNSGASAASRjMMSGQVSMGPAFIHH 
KP PKSRAI GNWSATSPRVWTQPNTN E YTFKITVSPNKP PAVS P 
G WS PTFE LTN LLNH PDK Y VETEN I H H LTDPT LAHVDR I S ETR K 
LSMGSDDAAYTQDI *R1SNS WLGM VAiiACNS SALGGQDGR 1 1 *A 
0EFETSWGN3WRLRliYRRF*NYAGMVAHTCSPSYSVD*ALLVHQ 
KARMERLQRELB I QKKKLDKLKSEVN E M ENNLTRRRLKRSNS I S 
01 PSLEEMOOLRSCNRQLOIDIDCLT>CE IDLFOARGPHFNPSAI 
HN FYDNIG FVG PVP P KPKDQRSI IKTP KTQDTEDDEGAQWNCTA 
CTFbNHPALI RCEQCEMPRHF 


5836 


363 


2303 


FH3TMCGICCS\T*FSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 
SDVNYQCL FS AH VLHJjRGVLTTQPV EDE RGNVFLWNGE I FSGIK 
VEAEENDTQILFNYLSSCKNESEILSLFSEVOGPWSFIYYQASS 
HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGLANOWQE 
VPAS\DFSELILSLLSFPDALFYNC3 LGNI FLGRILLKECML1A* 
V:<FQ0TYQKLYOR*QMKFNCILKNLLFL* I *CCHKLHWRLI AVI 
FPMCKLQERYFKSFLLMYT*KEVIQOFJ DVbSVAVKKRVLCLPR 
DENLTANEVLKTCDRKANVAILFSGG3 DSMVIATLADRHIPLDE 
PTm T ,NVJX F T A F F KTMPTT FNR KN KCE I PSEEFS KDVAA 
AAADSPNKHVSVPDRITGRAGLKELOAVSPSRIWNFVEINVSME 
ELQKLR RTR I CHL I R PLDTVLDDS IGCA.VW FASRG 3 GWLVAQEG 
V KS Y QSNAK WLTG I GADEQLAGYSRK RVRFQSHGJjEGLNKE I M 
MELGRISSRNLGRDDRV3GDHGKEARFPFLDENWSFLNSLPIW 
EKANLTLPRG2 GEKLLLRLAAVEbGLTASALLPKRAMQFGSR I A 
KMEK 3 NE KASDKCGRLQ I MS LENLS 1 E K F.TKL 


5837


4792 


903 


NGNAVAOAPVTNCCYLATGSKDQTIR 1 WSCSRGRGVMILKLPFL 
KRRGGG 3 DPTVKERLWLTLKWPSNQPTOLVSS CFGGELLQWDLT 
OSWRPKYTLFSASSEGQNKSRIVFNLCPL0TEDDKQLLLSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSLAIGVGDG 
MIRVWNTLS I KNKYDVKNFWQGVXSKVTALCWHPTKEGCIiAFGT 
DDGKVGLYDTYSNKPPQISSTYHKKTVTTLAWGPPVPPMSIjGGE 
GDRPSIAliYSCGGEGIVLGHNPWKLSGEAFDINKblRDTNSIKY 
KLPVHTEISWKADGKIMALGNEI)GSIE3FQ\IPNLKLICTIQQH 
HKLVNTISWHHE\HGSPAQKLSYL\MPSGSQOCSPFTCHNLKNC 
P* KAAPESPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH* WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIELEKKRLSQ 
PKAKPKKKKKPTLRTPVKLESIDGNEEESMKENSGPVENGVSDQ 
EGEEOAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHS RELNEDVSADVEERFHLGLFTDRATLYRMIDI EGKG 
HLENGHPELFHQLMLWKGDLKGVLOrAAXRGELTDNLVAr^APAA 
G YHVWLWAVEAFAKQLCFQDQYVKAASK LLS IHKVYE AVELLKS 
NHFYREAIAIAKARLRPEDPVLKDLYL5WGTVLERDGHYAVAAK 
CYliGATCAYD AAKVIiAKKGDAAS LRTAA E LAAI VGEDELS ASLA 
LRCAOELLl^NWVGAQ^AiQl^ESUXr-QRLVFCLLELLSRHLE 
EKQLSEGKSSSSYHTWNTGTEGPFVERVTAWKSIFSLDTPEQY 
OEAFQKLQNIKYPSATNNTFAKQLLLH3CKDLTLAVLSQ0MASW 



DEAVQALLRAWRSyDSGSPTJMOEVySAFLPDGCDHLRDKLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LS TFKELFSEKKASLONSORT VAE VOETLABMI RQHQKSQhCKS 
TANGPDKNBPEVEAEQPLfCSSQSQCKEEKNEPLSLPELiTKRLTE 
ANORMAKFPESIKAWPFPD\n J ECCLVLLL3RSHFPGCLA0SMQQ 
QAQELLQKYGNTKTYRRHCOTFCK 


5838 


110 


96 


KTMPHLLVTFRDVAIDPSQEEKECLDPAQRDLYRDVMLENYSNL 
1SLDLESSCVTKKL£FEKEIYEMES\PSGRIVIGNVST3TFQYNG 
LGDNMECKGNLEGOVSKSEGLYMCVKITCEEKATESHSTSSTFH 
RII/HYQGKIVKCKECRQGFSYLSCLIQHEEMHNI*KCSEVNKH 
RNTFSKKPSYI * HQ\KFRLGEKPYECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCN ACGKAF I RGSQLTF.HQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHORtHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHORVHTGEKPYICKECGKAFLCA 
SQLNEHQRIH7GEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFISNSNL30H0RIHTGEKPYKCKECGKAFICGK0LS 
EHQRIHTGEKPFECKECGKAFIRVAYLTCKBKIHGEKHYECKEC 
GKTF VRATQLTY HOR 3 HTG EKPYKC KECDKAF/ HLWLT I LSEHQ 
RIKRG EKP YECKQCGR / LF I RGSHL /NEHLRTHTGEKP YECKEC 
GRAFS RGSEHTLHQR 1 HTGSKFYTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHK3CMHSIALASLDFAHL0EKNPEN 


5839 


1 


2425 


GRPFPRPPRAI>PRLPLRGRRQEGRWTVDFEECLKD\SPRFRAAL 
EEVEG DVAELEL KL\DKLVKLC I A\MI DTGKAFCv/ANKQFMNGI 
RD\ IAQNS \NNDA\ WETKFAPS FLDS LQEM INFHTIL/L* PNS 
EIK*GHSFQNFVKEDLRKFKDAKKQFENSQ*KRKXIALVKNAPV 
PSR PASLEL * KP PNI LTATRKCFRH I ALDYVLQI NVLCSKRRSE 
I LKSMLS FMYAHLAFFHQG YDLFS ELG PYMKDLGAQLDRLVGDA 
AKEKREMEGKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RASNAFKTWKRRWFSIO^QVVYOKKFKDNPTVVVEDLRLCTVK 
HCEDI ERRFCFE WSPTKSCMLQADSEKLRQAWIKAVQTSI \AT 
AYREKDDESEKLDKK5SPSTGSLDSGNESKEKLLKGESALQRVQ 
CI PGNASCCDCGLADFRWAS I NLG I TLCI ECSG I HRSLGVH FS K 
VRSLTLDTWEPELLKLMCEU5NDVINRVYEANVEKMGIKKPQPG 
OROEKEAYIRAKYVERKFVDKIFL*SLSPP\EQOKK\FVSKSSE 
EKRLSISKFGP\GDOVRASAOSSVRSNDSGIOQSSDDGRBSLPS 
TVSANSLYEPEGERQDSSMFLDSJOil^PGUJLYRASYEKNLPXM 
AEALAHG ADVNVJANS EEN KATPL 1 QAVLGGSLVTCEFLLQNGAN 
VNGRDVQGRGPLHHATVLGHTGOVCLFLKRGANQHATDEEGKDP 
LS I AVEAANAD I VTLLRLARMNEEMRE SEGLYGQPGDETYQDI F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLKLPRQHLTTLWQ1 S S PRWRS PQRAFMSALSKTQTQSAPALQ 
GLSSLLQSVTGNPVPASEAASQSTSASPAWrrVYTIKGRKLPSS 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 
SNTKG FTATHNTS P AA F P TE VT I CQSSEVSK PKL \ESESTS PS L 
\EMK 1 HNFLKGNPG FSVA *NLKKPNPAGSLGSSAPS ESHFSDFQ 
RGPT S TS I DNI DGTP VR DERSGTPTQDEMMDKPTSS S VDTMS LL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLKDSSOEKFYPDTSFQEDEDYRDFEYSGP 
P PS AMMNLQKK PAKS 1 LKSS KLSDTTEYQ PILS S Y S HRAQEFGV 
KSAF P PS VRALLDS S ENCDRLS SS PGLFGAFSVRGNE PGSDRS P 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPMPVPHRS 
LFSP0NT1AAPTGH?PTSGVEKV1J\STISTTSTIEFKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQEEHY 
R I ETRVS S SCLDL PDS TEE KG AP I ETLG YHS ASNRRMSGEP 1 QT 
VES IRVPGKGNRGHGREASRVGWFDLS7SGSSFDNG PSS AS ELA 
SLGGGGSGGLTGFKTAPYKERAPOFOESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFOREPVGPSSAPPVPPKDHGGIFSRDAPTKLPS 
VDLSNPFTKEAAIAHAAPPPPPGEHSGIPFPTFPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 



TLPSHSLEHliGPFHGGGGGGGSNSSSGPPLGPSHRDTISRSGII 
LRSPRFDFRPREFFLSRDPFHSLKRPRPPFARGPPFFAPKRPFF 
PPRY 


5841 


1908 


762 


GLRLFLVI*TVWPKKKPSWLSRTEFSKRLLCRTLWCOSGWSSRSY 
TRSMLKMTTS INRKS RTSTKSTRTSARPGLTATVS 1GLSDS PTW 
KHC WM 1 AK S C 5 G c KGG H W AF RQ VG V Y LL» PG R VGCVS S RVSPSFP 
GLX3LDSGLARRGSAVSA1ASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFTI EDFHNTFMDLI EQVEKQTS VADLLAS FND 
OSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCONQE 
\VEPMCKESDHI H 1 1 ALAQGLQRVHPGWEYMGPRPRAATTNPHI 
FP+GLPSPKVYLLYRPG\HYDILYKIGLGSSPLGCPGCPLLARA 
LGHCYRGFSWVKVvSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QEPTADFKLRSTCGCGREMTCPDKPGQLINWFICSLCVPRVRKL 
WSSRRPRTRRNL1 .bGTACAI YLG FLVSQVGRAS LQHGQAAE KGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWYlTIiRSK 
RSKPANIRGTVXPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLML*HLGTLRECTWLRLESDPGGWCGVRE/WRAGGFDFLOPSS 
RESNIRIYSESAPSWLSKDDIRRMRLLADSAYAGLRPVSSRSGA 
RLLVLEGGAPGAVLR CGPS PCGLLKOPLDMSE VFAFHLDR I LGL 
NRTLPSVSRKAEF10DGRPCPI3LWDASLSSASNDTHSSVKLTW 
GTYQQLLKOKCTCNGRVPKPESGCTEIHHHEWSKMALPDFLLQI 
YNRbDTNCCGFR PRKEDACVQNGLRPKCDDQGSAALAHl I QRKH 
DPRHLVPIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVLKSQH 
LRQKLIjOSLFLDKGYVCESQGGROGIEKLIDVIEHRAKILITY in ! 
AHGVKVLPMNE


5843 


500 


1453 


GTARLVTCWVLHGO*VKXPAWEPGWWL*Q*RCRPKGWGLGAGM 
R3SRMSQPPQCLRRAQSSCCHFMVKLLDDGTFMIPGEKVAHTSL 
DALVTFHQOKP I E PRRELLTQPCRQKDPANVDYEDLFLYSNAVA 
EEAACPVSAPEEASPKPVLCHOSKERKPSAEM/RQNNHCGSHFL 
LPPKIPSWRDPPE7LEEPQNAPRERPEGPAAAKKPPRHCBLWT 
LGCP E I HGDLR P WPRKRQPRS LRGSHLGGQR LHG S LCGH 1 SQKP 
LTAPGTKRQKGPHOSGREVGQLH+GDPRGQELAPNGSESPILPG 
VCARAPGLGRA 


5844


202 


2471 


FDSAVLSS I NVMAVLPG PLQLLGVLLTI SLSSI RblQAGAY YG1 
KPLPPQI PPQMPP0I PCYQPLGQQVPHMPLAKDGLAI^JGKEMPHL 
OYGKEYPHbPQYMKEI OPAPRMGKEAVPKKGKEI PLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGE3GQKGEIGPKGIP*PQGPPGPHGLPG3GK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\?GAPGEP 
GP0G P I G V ?G VQG F PG 2 PG I GKPGQDG \ I P GQPGFPGG KGEOGL 
PGLPGPPGLiPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
PKGEPGLOGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLFG VPGJjijG.FKG.bPbl PGDQGL»V>GPPGI FGiGGF£>GPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPOG0PGL 
PGP PGPPGP PGPPAVMPPTPPPOGEYLPDMGLGI DGVKP PHAYG 
AKKG KNGG P AYEM PA FTAELTAP FPPVGAP VKFNKLLYNGRON Y 
NPQTGIFTCEVPGVYYFAYHVHCKGGNVWVALFKNNEPVMYTYD 
EYKKGFLDQASGSAVLLLRPGDRVFLQMPSEOAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASI^DKy^PKEKTAMCLVNELARFNRVQPQYKLLNER 
G PAHSKM FS VQLS LGEQTWES EGSS I KKAQQAVGNKALTESTLP 
KPI*KPPKSNVNNNPGCITPTVEIjNGLAMKRG\KPAIHRPLDPK 

PFPJWRZkNYTJFOVftfYNORYHf'PT PKTFYVOT.TVfiNNFFFfS^RKT 
ri ri'ii\fvMJ> iwryvr:i HyftiijLri Jr AX r l vyul voixjx&r r o*>v7H£ 

rqaarhnaamkalcalonepipers PQNGESGKDMDDDKDANKS 
eislvfeialkrnmpvsfevikesgpphmksfvtrvsvgefsae 

GEGNSKXLSKKRAA'J'TVLOELKKLPPLPVVEKPK\HFFKKRPKT 
I VKAG PEYGQGMNF 3 SRLAQIQQAKXEKEPDYVLI^SERGMPRRR 
EFVMQVKVGNEVATGTGPNKXIAKlOJAAEAMliQI^YKASTNIiQ 
DQLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 
RHKVISGTTLGYLSPKDMNQPSSSFFS1SPT3NSSATJARELLK 
NG TS ST AE A I GLKG S S PT P PCS P VQPS KQLE YLAR 1 QG FO VK Y C 
DROSGKECVTCLTLAPVQMTFHAIGSS IEASHDQV* YATAILLC 
YGPARKVJKAIKMEAKCAHAALLSLIHYLLAPSAPLEKSKLFALG 

N- 


5846 


1126 


456


FSKLIKKTFIIGISGVTNSGKT7LAKNLQKHLPNCSV1SQDDFF 
KPESEIETDKNGFLQYDVLEALNMEKMMSAISCWKESARHSWS 
TDQSSAEEIPILIIEGFLLFNYKPLDT1 WNRSYFLTIPYEECKE 
RRSTRVYQPPDSFGYFDGHVWPMYLKYRQEMQDITWEWYLEGT 
KS EEDLFLOV Y EDL1 QELAKQKCLQVTA* RRNTTNPS / CK* I RK 
LQGVI 


5847 


2769 


505 


APEMEDLSSPDSTLLQGGHNLLSSASFQESVTFKDVIVDFTOEE 
WKQLDPGORDLFRDVTLENYTKLVSIGLQVSKPDVISOLEOGTE 
PVUMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
KKRDDSWSSNLLESKETEGSLERQQANOQTLPKEIKVTEKTIFS 
W E KGPVNNE FG KS VNVSSNLVTQEPS PEETSTKRSI KQNSN PVK 
KE KS C KCNE CG KA FS Y CS ALI RHQRTHTGE KP Y KCN* / C VE KAF 
S R S EN LI NHQR I HTG DK P Y KCDQCGKG FI EG PS LTQHQR I HTGE 
KP Y KCDECG KA FSQRTHLVQHQRI HTG EKP YTCNECGKAFSORG 
HFKEHQKIHTGEKPFKCDECDKTFTRSTHLTQHQKIHTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
H0KTHTGEKPYDCAECGKSFSYWSSLAOHLKIHTGEKPYKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 
HTQEKAYECKECGKAF1RSSSLAKHERIHTGEKPYQCHECGKTF 
* SYGSSLI QHRKI HTGERP YKCNECGRAFNQNI HLTQHKR I HTGA 
KP Y ECAECGKA FRHCS S LAQHQKTHTE E KPYQCNKCEKTFSQ SS 
HLTQHQR I HTG E K P YKCNECDKAFS R S THLTQHQR I HTGE KP Y K 
CNECGK\TFSOSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSALN 
KHQRLHPGI 


5848 


22 


2961 


AA PR R LLRGGDG DRTPR F PL P ALLR PGP P AEAAP ER RKM P A V S K 
GDGMRGLAVF1SDIRNCKSKEAEIKR3NKELAN1RSKFKGDKAL 
DG YS KKK Y VCKLLF I FLLGHDI DFGHM E A VNLL S SNR YTE KQ I G 
YLFISVLVNSNSELIRL1NNAIKNDLASRNPTFMGLALHCIASV 
GSREP4AE AFAGE I P KVLVAGDTMDSVKOSAALCLLRLYRTS PDL 
VPMGDWTSRWHLLNDQHLGWTAATSLITTLAQKNPEEFKTSV 
S LAVS RLS \ R I VTS AS TD LQDYTY* FCPGFLGLSVKLLRLLQCY 
PPPDPAVRGRLTECLETI LNKAQEPPKS KKVQHSNAKNAVLFEA 
I SLI I HHDSEPNLLVRACNQLGQFLQHRETNLR YLALESMCTLA 
S SEFS HEAVKTH 1 ETV I NALKTERDVS V RQRAVDLLY AMCDR S N 
A?QI VAEMLS YLETADYS IREEI VLKVA I LAEKYAVDYTW\Y\T) 
T I LNL I R I AGDYVS EE VWYR V I QI VINR DD VQGYAAKT VFEALQ 
APACHENL VKVGGY I LGEFGNL I AGDPRS S PLI QFHLLH S KFHL 
CSVPTRALLLSTYIKFVNLFPEVKPTIQDVLRSDSQLRNADVEL 
QQRAVE YLRLS T VA S TDI LAT VLESMPP FP ERES S I LAKL KK KK 
GPSTVTDLEDTKRDRSVDVNGGPEPAPASTSAVSTPSPSADLLG 
LGAAPPAPAGPP PS SGGSGLLVDVFSDS AS WAPLAPGS EDNFA 
RFVCKNNGVLFENOLLQIGLKSEFRQNLGRMFIFYGNKTSTOFL 
NFTPTL3 CSDDLQPNLNI iQTKPVDPTVEGGAQVQQWNIECVSD 
FTEAPVLNIQFRYGGTFQNVSVQLPITLNKFFQPTEMASODFFQ 
RWKQLSNPQQEVQNI FKAKHPMDTEVTKAKI IGFGSALLEEVDP 
NPANFVGAGI I HTKTTQIGCLLRLEPNL-QAQMYRLTLRTSKEAV 
SQRLCELLSAQF


5849


3545 


1895 


KRREIKETVFHHVAQAGLELLSSSNPPSSASRSAGITGMRHQVQ 
P*D?CMSLSPPCFTEEDRFSLEALQTIHKQMDDDKDGG1EVEES 
DEFI REDMKY KDATNKHSHLHREDKHI TI EDLWKRWKTS EVHNVJ 
TLEDTLOVJLI E FVELPQYEKNFRDNNVKGTTLPRI AVKBPS FMI 
SQLKISDRSHROKLOLKALDWLFGPLTRPPHNWMKDFILTVSI 
VI G VGG CW FAYTQN KTS KEHVAKMM KDLES LQTASQSLMD LQER 
LBKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
CELSRRQYAEQELE0VRMALKKAEKEFELRSSWSVPDALOKWLQ 
LTHEVEVQYYWI KRONAEMOLAIAKDEAEKl KKKRSTVFGTLHV 
AHSSSLDEVDHKI LE A KKALS ELTTCLRE RL ?RW0Q I E K. I CG FQ 
IAHNSGLPSLTSSLYSDHSWVVMPRVSIPPYPIAGGVDDLDBDT 
PPIVSOFPGTfAAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHFRHPHHPQHTPHSLPSPDPD1LSVSSCPALYRNEEE 
EEAIYFSAEK0WEVPDTASECDSLNSSIGRKQSPP/SKPRD1PN 
II S/DERYQEMRCP* RI PSGGIL 


5850 


3 


1895 


KAVliNFSASGSVISLTGSNPMHDASMIfTKLKKNGIIVYLDVPLLN 
LI CRLKLMKTDR I VGONSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGASPEEVADX VLNA I KR YODVDSETFI STRHVWPEDCEQKVSA 
EFFIEAVIEGLASDGGLFVPAXEFPKLS CGEWKSLVGATYVERA 
QI LLERCIH P ADI PAARLGEMI ETAYGENFACS KI APVRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQR1AWAFFPENGVSDFOKAQIIGSQ 
KENGWAVGVESDFDFCQTAIKRIFKDSDFTGFLTVEYGTILSSA 
KS INWGRLLPOWYhTiSAYLDLVSOGFISFGSPVDVCl PTGNFG 
K I L AAVY AKMMG I P I R K FI CASNQNHVWTDF I KTG \H YDLRG KE 
N*AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHH FQ I EKALVEKLQQDF VADWCS EGECLAAINSTYNTSG Y ILD 
PHTAVAKWADRV0DKTCPVIISSTAHYSKFAPAIMOALKIKEI 
NETSSSOLYLLGSYNALPPLHEALLERTKG^EKMEYOVCAADKN 
VLKSHVEQLVONQFI 


5851


3120 


1802 


RCYLOFLALLLTSTSARAAAAI AAAEE PAGS PSVMTRAGDHNRO 
RGCCGSLADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVELY 
GNS LLLTAVYGLWAGS VLVLGAI I GD WVDKN AR L KVAQTS LW 
QNVSVI LCGI ILMMVFLHKHELLTMYHGWVLTSCYIL-IITIANI 
ANLASTATAI T 1 ORDWI VWAGEDRS KLANMN ATI RR I DQLTNI 
LAPMAVGQI MTFGSPV1 GCGFI SGWNLVSMCVEYVbLWKVYQKT 
PALAVKAGLKEEETELKOLNLHKDTEPXPLEGTHLMGVKDSNIH 
ELEHEOEPTCAS QMAEP FRTFRDGWVS Y YNQPVF / LGKHGSCFP 
LYDCPGL* LHHHRVRJuHSGTEWFHPOYFDGS I SYNWNNGNCS FY 
LATSKMWFG S DRSDLRI GTAFLFDLVCDLCI HAW K P PGLVRFSF 


5852


1 


422 


KTTFPS H LC PI jRQLPEVRG YSGQPLTDP L I SLCR SHKCRG KG WG 
SSS YPSLPALLRARS AFGHCTHRSCG PEWRI DS I SRLEMQGARR 
SGW AOAQPT I LL1»V PRLR KSLPS I WG / S L.MGF FI T S GPG / W FRQ 
YYFFISGRH*VLFTESDFYYVAMDrGGHGLSSKYSPGVPYYLQT 
FVS E I RR WAG K KQS VY FRR CGGCS RA P P LI TGGG VG SRKQRWP 
ESGAWAliAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAW1SDPETKGDPGGPWGMWRGSDLRPR 
P VS LTG LTLV CK * AAQGPQ V\ HS VKLCFG LGG \ P CLL\ FP I FR P 
LLLKPRRPPXK PGTRGVAVEPHALRWHVAHGEE AGI RAAGPGH 
GGVEIPOG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGlaGQDQGPREQOKQGSGRHDTILGDMGESE 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSLVSIQEEGPDT 
GWEREKRNPAEMGNP0RWASPIHTPPIX3PEILRA>1PEALRAMPE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
LSSLCI TES PSQW WTPCLLLLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAALNNRENASS * NGY/S RWKQDI RRIENHIIQE 
LXHLCAMI XRVLLERLENTRKLRELTEGR TLDWPQNRI TEVSAK 
RQI VTE YREKGKRJ** EEKKRDLEGRSRRYNLCI1 G I PETEDRAS 
GAETI KDLLE/ENFPELKNELDLQMEKAHR I PLK FNEKKAASRH 
IRVTFL / KFQRRN I LQASSQRKQVTY KG AKVRLTSDFS PAI LNA 
RR0W/N/PISRVLRENNFEPR1IYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELS LK 1 LLKDLLQLTENLN 


5855 


536 


2391 


LRS YGCKAPSR 1 S KLHK\ FLFLLLPS LLMGYS ES PPP I TDSWAP 
FISLTKHVLSOSOSPLSSNCWI CLSTHTQ* FTALPADLLTWTQS 
NVSLHI SYLAI PFLADS FLKPV/L* PGKSAKHLS FKLS SLSMVS 
GRAVALLHLIASGLTSIQTNTASSKPPIWGY\LSTOTSFISPPP 
LCLSRT YPN PAHATMVGQVPQS LCGLI FTL /RTP CR PS I LH PNY 
KIISTSAWQKVLCFSGSPTIHTSLHLTTGSSFLSFKPIPGFPAA 
NSALYVS SLKGP PGKNVT I PS PVTGT* OPPHRGSN/ RLTVDKDN 



FFLSFKPNSLHObPSQ\TFyQALTGAALAGSYPlWENSNTLSWL 
F7FT^KFCL£7PSLFFLCDTN*YLCLPANWSGTCTLVFQAPTIN 
I LP PNQTI L I S VEAS I S SSP IRNKWALHLI TLLTGLG 1TAALGT 
GIAGITTS I T3YCTLFTTLSNTVEDMHTS I TSLQRQLDFLVGVI 
LONWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL * HQVADS WWQGSS LLRWIPWVAPFLGPLI FLFLLLMIGP 
C 3 FN bVS RF1SQRLNCFI QASMQKH IDNIFH LCHV ♦ YQS LRGNH 
SEAPEPRP 


5856 


173 


1137 


PKLHGLGLSAVFLFYL* /Y VTFHLYGGI 3LLLLIFISIAGILYK 
FODVLLYFPEQ PSSSRL Y VPM PTG I PHENI FIRTKDG 1 RLNLI L 
I R Y TGDNS PYS PTI I YFKGNAGNIGHRLPNALLMLVNLKVNLLL 
VDYRG YGKS EG E AS E EG L YLDS EA VLDYVMTS PDLDKTKI YLSG 
RS LG \ G AAA I H LAS DNSHRISAI M VENT FLS I PHMA S TLFS FFP 
KRYL»PLWCYKNKFLSYRKISQCRMPSLFISGliSDOLlPPVMMKQ 
LY ELS PSRTKR LAI PPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEKAKTSSNVTI1 


5857 


1597 


563 


KLIGKVLVLSWADAMAAFAVEPO^PALGSEPMMLGSPTSPKPG 
VNAQFLPGFLKGDLPAPVTPQPRSISGPSVGVMEMRSPLLAGGS 
P PQP WPAHKD K SGAPP VR S I YDD ISSPGLGS TP LTS RRQ PNI S 
VNQS PLVGVTS TPGTGQS M FS PAS I GQPRKTTLS PAQLDP FYTQ 
GDSLTSEDH\LDDSWGDCIWGFLKASA\SYILL\QFAQYGGIS* 
^WMSNTGNWMHIRYQSKLOARKALSKDGRIFGESl^IGVKPCl 
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTOPGSTFRISTMRP 
LATAYKASTSDYOVISDRQTPKKDESLVSK/iMEYMFGW 


5858 


355 


1419 


FPHQPAAASTSXHQQQQPPPPPQDSSKPWAQGPGPAPGVGSAP 
PAS SSA PPATPFTSGAPPG SGPGPTPTPPPAVTS AP PGAP P PTP 
PS S GV? TTP Pp AGGP P P P P AAVPGPG PG PKQG PG PGG P KGGKMP 
GGFKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPFPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 
IXGIYLLISRRMNSRRLFAKrWENQEKFLSTKAKDSEFIKLESR 
ALA*NC?KFELG * YTP+GGRQLPSSLFPTHACLPLSCSV3 FSPF 
MFPQ * NCWGR K P FR PNLG PHL fCGAVCNR WDDPWEG PTGKGHCLN 
FAS 


5859 


307 


1503 


GG£ SARPRASS KRMLSRKKTKNEVSKPAEVQGKYVKKETSPLLR 
NLMPSFIRHGPTI PRRTDJ CLPDSSPNAFSTSGDGWSRNQSFL 
RTF IQRTPHEI NRRESNRLSAPSYLARSLADVPREYGSSQSFVT 
E VS FA VENGDS G S R Y Y Y S DN FFDGQR KR PLGDRAH E D YRY Y E YN 
HDLFORMPQNOGRHASGIGRVAATS LGNLTNHGS EDLPLPPGWS 
VDKTMRGRKY Y 1 BHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTNKKACY \RHPCAPTCTSV* STTSOTI/AS / RQQTERNQ 
SLLVPANPYHTAEIPDWLQVYARAPVKYDHILKWELFQLADLDT 
Y03MLKLLFMKELEQrVKMYEAYRQALLTELENRK0RC?QWYAQQ 
HGKNF 


5860 


2956 


1270 


T I R VE E F PLCPG GG KAQLS S ASLLGAG LLLQPPTP P PLLLLLFP 
LL hFSR LCGALAG P I IVEPHVTAVWGKNVSLKCL3 EVNETITQI 
SWE K I HGKSSQT VAVHHPQYG FS VQGE YQGRVLFKNYS LNDATI 
TLHNIG FSDSGKYI CKAVTFPLGNAQSSTTVTVLVEPTVSLIKG 
PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETATUSOYKLFPTRFARGRRITCWKHPALEKDIRYSFILDI 
QY A PEV S VTG YCGNWF VG RKGVNbKCNADANPP P FK S VWSRLDG 
OWFDGLLASDNTLHFVHPLTFNYSGVYICKVT\NSPGSKEVTQK 
VH PTFQDP S LPT YP PLPALQFQWAS PSTA * TSRD\ LATE P+KIA 
PSPLSTL\ATIKGWTQLPTIIA+CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDY FAKNYI PPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNP VNNLI RKDYLEEPE KTQWNNVENLNRFERPMDY YEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACV0AFWLVASSGDDSQGGDKCGCEVGSl'JVGSMRWt4ARLL 
SEGEQGIPTACAAFAQQPAG/EPRRGLAGVGEGGPOCSWVNYRC 
TLEFLVSLLGTDLARGRGNSASGPTAPADSKQL/ ML* DVHRRVI 
LE* RMNSGS PAR DNAPSQR FCTNLSEGLRFG IS PSWREALYGCH 



A 


5862 


lSbt 


483 


PPFOLIMGEIKVSPDYNWFRGTVPLKKIIVDDDDSKIWSLYDAG 
PRSIRCPLIFLPPVSGTADVFFRQILAbTGWGYRVIALQYPVYW 

rvi c rrrWDVi 1 r>u7 rvr r>v\ruT vr* her i^ppt Tv^lfPfiPVTUifco 
Lr.L>Lr v_LHjr KAJjJjUnijU.ljL'iV. vnljr wv&Lnaur iiAUIVrMt 1 1 n\or 

RVHSLI LCK S FSDTS 1 FNQ TWTAN S F W LM PA F MLKKI VLGNFSS 

GPVDPM>4ADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPH^I 

RDIPVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLKTGGNFPY 

LCRSAEVNLYVQIHL/R/RNSMEPNTRPLTHQWSVPRSLRCRKA 

ALASARRSSSVSliAVNDELTRCVLV*SVASAPVSRPFPSGSSGS 

PVLTVSGK 


5863 


2714 


249 


PFPSRGSLFLAAPREDTMGPLMVLFCLLFL^PGLADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGIjYPSPASRLCKSSGQWQ 
TPGATRS LS KAVCKPVKC PAP VS FENG I YT PRLGS Y PVGGNVSF 
ECEDGFI \LRGS PVRQCR PNGMWDGETAVCDNGAGHCPNPG ISL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECOGNGVWSGTE 
P J CRQ P YS YDFP ED VAPALGTS F SHMLGATNPTQK TKES LGR KI 
QIQRSGHLNLYLLLDCSQSVS ENDFL1 FKESASLMVDRIFSFEI 
NVS VA 1 1 TFAS E P K VLMS VLN DN S RDMT E V 1 S S LENAN Y KDHEN 
GTGTNTYAALNSVYliMMNNOMRbLGMETMAW\OEIRHAIILL\T 
DGK\SHMGGSPKTAVDHIREILN1N0KRNDYLDIYAIGVGKLDV 
DWRELNS LGS K KDG ERHAF 1 LQDTKALHQV FEHM LDVS KLTDTI 
CGVGNMSANASDQERTPWHVT1KPKSOET\C\RGALISDOWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIFKAVISPGFDVFA 
KKNQG I L\EFYGD\ DI ALL\KLAQKVKM \S7HCQGPSChP\ CTM 
\EANLGFLRETFKGSTCR\DHENEL/V\WKQSV\PAHF\VA1)\N 
GSKLEHLTLRMGVEWTSCCRGLSPKKKTM\FFNLT\DVRB\WT 
D\OFL\CS\GPOEDESP\ CK * E\SGGA\ VFLF KRKRLSAGGVWC 
SWGL\YNP\0£SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q* S PWLROHPGGMS * I FLPLLANGKLS P FACFAR I CRPLKFLPS 
EKATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPbbCFPGGOEPSAPSPCLYSFLWACSFTMG 
KLPPS I P PSS PLAC VLKNLK P LQLTPDLK P KCU FFCNTAWPQY 
KLDNDSK*PENGTFEFSIL0VLDNSCHKMGKV3SEVPDVQAFF\S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPFHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGZVKVSAJPFSLSQIR*RL 
GSFSSNI KIQPSSWL1 WQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTIVGC1 FFKTAI 2 SHFKGGMYLCVCMCTC 
LSVCVCVQVGSWI CV/CVSMCACVSLCTC\ ICRCI SMYTREHAC 
ACTRV * V YMCMS / VCTCVS TCI DVRVCAHV CVY MCLCLG YA* AC 
TCV*MCVCMHEKVCMC/VCACSCVLL/CRGHI CM/MCMSAY1 CI 
/ CVY VCVLCW1ACMRMSTCVK LVYG* ACTCVWMHM/ CSCTCR/ C 
VHVCCMSMHACECLO?YI>H1CGCAGTRRWWAGSARGSRSCSRIiP 
CWAPGPGLSLPGPSCPSVEOGLGGGPGQLQGRSGEARIiGEHRGH 
GS PAAVCS RNCTVS PRRGADC F£ APDVP KO PPG WGRAS FEERGC 
GGRGW VCAPPLKGPQCCCFS 1 KPELKAKXKK 


5866 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDLDDL 
KKEVAMTEHKMS VEE VCRKYNTDCVQGLTKSKAOE I LARDG PNA 
LTPPPTTPEWVKFCROLFGGFS ILLWIGAI LCFLAYGIQAGTED 
DPSGDNLYLG I VLAAWI I TG C FS YYQE AXSS K 1 MES FKNM VPQ 
CALV I REG E KMQVKA EE VWG DLVE I KGG DRV P ADLR 1 1 S AHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGVWATGDRTVMGR I ATLASGbE VGKTP I AI E1EHFIQLI TGV 
AVFLG VS FF I LS LI LG YTWLE A V I FLIG I 1 VANV P EGLLATVTV 
CL/TLTAKR MAR KNCLV KNLEA V ETLGS'f S T 1 C5D KTGTLTQNRM 
TVAHT<WFDNOIHEADTTEX)OSGTSFDKSSHTWVALF*H/LLGFC 
NRPVFKGGQDNI PVLKRDVAGDASESALLKCI ELS SGS VKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVKKGAPERILD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEOFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGI KVI MVTGDHP ITAKA I AKGVG 1 1 FEGNETVEDI AARL 
KIPVSQVNPRDAKACVIHGTDLKDFTSEQIDEILQNKTEIVFAR 
7SPQQKLIIVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGI 
TiGSDVS KQAADM I LLDDNFAS I VTGVEEGRLI FDNLKKS IAYTL 
7SNIPE3TPFLLFIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESDI MKRQPRNPRTDKLWERLISMAYGQIGK1 OALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGOOWTYEORK 
WEFTCHTAFFVS1VWQWADLIICKTRRNSVFQ0GMKNKILIF 
GLFEETALAAFLSYCPGMDVALRMYPLKPSWWFCAFPYSFLIFV 
YDEIRKL I LRRNPGGWVEKETYY 


5867 




1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSSPVAKP 
GPVKTLTRKKNKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DF SQN WKALQE WLL KQ KSQAP E K PLV I S QMG S KKK P K 1 1 QQNK K 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAFVPRTKASGT 
EHNKKGTKERTMGDIVPERGDIEKKKRKAK\GQPOPHPPR/IDI 
MFDDVDPADIEAAIGPEAAKIARKOLGQSEGSVSLSLVKEQAFG 
GLTRALALDCEMVGVGPKGEESMAARVSIVNOYGKCVTDKYVKP 
TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEKLKGRILVGH 
ALHWDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
LGLQVQQAEH CS I QDAQAAMRLY VM VKKEWE S MAR DRRP LLT A
PDHCSDDA*QSCPAAAAAPIiQRQCDQSCGQITSPOSGNSGETFS 
ESWQRGVAWCY 


5868


2122 


833 


LTAGAS HTQDASQSTS AKYPAAAQNL/ CVTNAMR EDLADI W YI R 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
7ERS AFTERDAG SGL VTRLRER PALLVS S TS WTEDED FS I LLAA 
LESRV* T\MTLDGHNLPS1»VCVI TGKGPLREYYSRLI HQKHFQH 
I OVCTPWLEAED Y PLLLGS ADLGVCLHTSSSGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAOIiCMLFSNFP 
DFAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


LT AG ASHTQDASQ S TS AKYPAAAQNL / C VTNAMRE DLiAD I WY IR 
A V TVYDKP AS FF K ETFLDLQHRLFMKLGSMHS PFRARS EPE DP V 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSILLAA 
LESRV* T\MTLDGHNLPSLVCVI TGKGPLRBYYSR LI HQKHFQH 
J OVCT P W LEAED YP LLLGS ADIiG VCLHTS SSGLDL PMKWDMFG 
C CLP VCA VNFKCLHELVKHEENG LV FEDS EELAAQ LQMLFSNF P 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGAS H TQDAS QSTSAKY PAAAQNL / C VTNAMRE DLAD I WY I R 
A VTVYDK PAS F FK ETPLDLQHR LF MKLG SMHS PFRAR S EP EDP V 
TERSAFTERDAGSGLVTRLRERPALLVSSTS WTEDED FS3LLAA 
LESRV* T\MTLDGHNLPSLVCVITGKGPLREYYSRLI HQKHFQH 
^OVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLFMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 


FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 
VLKLL* LSLRRL* LEPTI *NGLLT+ CSRLS VFRFLKV\GSVYEP 
LKSITfLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGGDOKAKIQDSLYCAAGAWALALAYRRIDDDKGRTItELEHSAI 
K CMRG ILYCYMRQADKVQQFKQDPRPTTCLHS VFNVHTGDELLS 
Y EEYGHLQINAVSLYIibYLVEMISSGLOI IYNTDEVS FIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
I- * KQFNGFNLPGNQGCSWSVI FVDLDAHNRNRQTLCSLLPRESR 
SKNTDAALLPCISYPAFALDDEVLFSQTLDKWRKLKGKYGFKR 
FLRDG YRTSLEDFNRCYYKPAEI KLFDGI ECEFPI FFLYMMIDG 
VFRGNPKQVQEYODLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRIM3KLFLWGCALY I IAKLLADE LIS PKDI 
DPV0RYVPLKDQRNVSMRFSNQGPLENDLVVHVAL1AESQRLQV 
FLNTYG1 QTQTPOOVEPIQI WPOOELVKAYLQLG I NEKLGLSGR 
P DRP I GCLGTS KI YRI LGKTWCY PI I FDLSDFYMSQDVFLLI D 
D • KNALOFIKQYWKMHGRPLFLVLI REDNIRGSRFNP 1 LDMLAA 
LK KG 1 1 GG VKVHVDRLQTLISGAWEQLDFLRI SDTEELPEFKS 
F E ELEPP KHS KVKRQS STPSAPE LGQQPDVN I S EWKD KPTH EI L 
Q K LNDCS CLASQA I LLG I LLKREG PNF 1 TKEGT VSDH 1 ER VYRR 
AGSOKLWS WRRAASLLS KWDSLAPS 1 TIWLVQGKQVTLGAFG 
HEEEVISNPLSPRVIQNI1YYKCNTHDEREAVIQQELVIHIGWI 
3 SNHPELFSGTLKIRIGWI IHAMEYEL03 RGGDKPALDLYQLSP 
S E V KQLLLD I LQ PCQNGR CWLNRRQ I DGS LNRTPTG F Y DR VWQI 
LE RTPNG 1 1 VAG KHLPQQ PTLS DKTM Y EMN FS LLVEDTLGN I DQ 
F0YRQIWELLMWSIVLERNPELEFODKVDLDRLVKEAFNEFQ 
KDQSRLKE1 EKCDDMTSFYNTP PLGKRGTCS YLTKAVMNLLLEG 
E V K PNNDDPCL I S 


5872 


68 




V ()G Y MY R FV I K I N SCY S E KTS I CR K R CCPE L PATQPW PTPT VF F 
NIAIDSESLGCIXSFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\LCQGGDFTHHNGTGGKSLYSKEFDDENFl/bKK 
T APG VLSTANAGPTTNGSQFF I CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGKTSKKITAANCGQL 


5873


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GECVGPN KCRCF PGYTGKTCSQDVN ECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLCP 
ESGLRLAPNGRDCLDIDECASGKV2CPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KOGYKGNGLRCSAI PENSVKEVLRA PGTIKDRI KKLLAHKNSMK 
KXAK3 KTTVTPEPTRTPTPKVNI^PFNYFFIV^RGGN^Hf;C5\KKR 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGL1 L\VQRKALTSKLEHKADLN1 S VDCSFNHG\ICDW\KQDR\ 

EDDFDW\NPADR\DNAI \gfy\mavpglwoghk\kdigrlklll 

PDLOPQSNFCLLFDYRLAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKWKTGKJQLYQGTDATKS 1 1 FEAERG KG KTGE I AVDGVLLVS 
GLCPDSLliSVDD 


5874 


2 


3387 


acprij^rrrrvrslrjrrrgwij^arwsrgonkmaarritoetfd 

AVLQEKAKRYHMDASGEAVS ETLQ FKAQDLLRAVPRSRAEM YDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECXEKE\ 
S RDYD VDHSG \ E A \ DS VLRG S \ S QVO A\ RG RALNI VDQEGS LLG 
KGETOGLLTAKGGVGKLVTLRNVSTKKI PTVNRITPKTQGTNQI
OKNTPS PDVTLGTNPGTED 3 QFPICKI PLGLDLKNLRLPRRKMS 
FDIIDKSDVFSRFGIEIIKWAGFHT1KDDIKFSQLFQTLFELET 
ETCAKMLAS FKCSLKPEHRD FC FFT I KFLKHSALKTPRVDNEFL 
NMLLDKGAVKTKNCFFEI 3 KPFDKY 1 MRLQDRLLKSVTPLLMAC 
NAYELSYKMKTLSNPLDlAbALETTNSLCRKSLALLGQTFSLAS 
S ? R0EKIL*AVGLQD1APSPAAFPNFEDSTLFGREY IDHLKAV7L 
VSSGCPLQVKKAEPEPMREEEKMIPPTKPEIOAKAPSSLSDAVP 
ORADHR WGTIDQLVKRVI EGSLS F KERTLLKEDPAYWFLSDEN 
S LE YK Y YKLKLA EMQRMS ENLRGACQKPT5 ADCAVRAI4LYSRAV 
RNLKKKLL P\ WQRRGLLRAOG \ LRG \ W KARRA\ TTGTQTLL FLR 
APGLKHHGRQAPGliS \QAKPS LPDRND\ AAKD\CPLDPV\GPSP 
ODPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLAR FVAQ VG \ PE I EQF \ S 1 \ ENS TDN PDLW FL\HDQNS S \AF K 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
G E AE FEDE PPPRE AELESPE VM PEEEDEDDEDGGEEAPA\ PGRG 
G PS LEGS T PADGLPGEA\ AE DDL/ ALGAPALFTGLLC; VTCFP FG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPKS 
KFJCKPKDLDFAOQKL\TDK\NLGFO\MLOKMGWKEGHGLGSLGK 
GIR\SRSACTQQAAWGGSGWGLSPSTCSLPLGSFTAKMAYSWQL 
1FV? 


5875 


296 


1848 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LEFS GSLFPHAI CLGDVDNDTLNEL WGDTSGKVS VYKNDDSR P 
WLTCS CQG MLTCVG VGDVCN KG KNLL VAVS AEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKOHI PANTKVMLI SDIDGDGCREL 
WGYTDRWRAFRWEELGEGPEHLTGQLVSLKKWMLEGQVDSLS 
V7LGPLGLPELMVSQPGCAYAILLCTWKKDTGSPPASEGPTDGS 
/SGDPSCPRRGAAPDIWPYPQOECLHSPNWQHQT\SHGTESSGS 
GLFALCTLDGTLKLMEEMEEADXLLWSVQVDHQLFALEKLDVTG 
NGHEEWACAWDGC-TY 1 1 DHNRTVVRFQVDENIRAFCAGLYACK 
EGRNSPCLVTVTFNQKIYVyKEVOLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL* LVPCFTKRSTIOTSHHSVLPQASRI PPS 
KTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ 


5876


1122 


224 


HLPLGVPS KVAGAAAMEPQEERETQVAAWLKKI FGDHPI ?QYEV 

kprtteilhhlsernrvrdrdvylviedlkqkaseyeseakylq 
dllmes vn fs panls s tg s r y lnalvds avaletkdts las fip 
avndltsdlfrtkskseeikielekleknltatlvlekclqedv 
kkaelhlster\akvdnrrqnm\dflkaxseefrfgiqaageql 
sargq\dafsvpiqslvalirenwprlkqqtiplk\kklesyld 
lmp\npshcsk*rieeak\rela\sieaeltrrvs\mmel 


5877 


2030 


1907 


gtlgkmaass sge kekerlggglgvaggnstrerllsaledlev 
lsrelieklaisrnqklloageenqvlellihrdgefqelmkla 
lnogkihhemovlekevekrdsdiqqlqkqlkeaeqilatavyq 
akeklksi exarxgai sseei i kyahr is asnavcapltwvpgd 
prrpyptdlemrsgllgomnnpstngvnghlpgdala/rrxiar 
cpcstvs/ngs0mtcr+ iniililqksvcel 


5878


950 


2112 


GLWKCMOL0GPKTHRVQP*PTPRQQGPQ\VPVAVIAGNRPNYLY 
RMLRSLLSA0GVSPOM1TVFIBGYYEEPMDWALFGLRGIOHTP 
ISlKIVArCVtyHi KJiSLlAl rNLr r£»AK-T AVVJjfc£.DijL>lAVl/r ri> 
FLSOSIHLLEEDDSLYC1SAWNDQGYEHTAEDPALLYRVETMPG 
LG W VLRR S LY KEE LE P KW P TPE KLWDWDMWMRM P EQRRG REC 1 1 
PDVSRSYHFGIVGLNMNGY FHEAYFKKHXFNTVPGVCLRNVDSL 
KKEA Y E VE VHRLLS E AE VLDHS KN PCE DS FLPDTBGHTYVAFIR 
MEKDDDFTTWTOLAKCLH I WDLDV RGNHRGLWRL FRKKNH FL W 
GVPASPYSVKKPPSVTP1FLEPPPKEEGAPGAPE0T 


5879 


3 


981 


RLTE AAAAGSGSRAAGWAG SPPTLLPLS PTS PRCAATMASS DED 

/T^tfC TV CTArunD f 7\ Tif VT3 DDT -T'TT TV TMiJT TCVn T TV MTTA CU1 V/T 

AIAMVRFYMEKGTHRGLYKSIQKTLKFFOTFALLEIVHCLIGIV 
PTSV1 VTGVQVSS R I FMVWLITHS I KP I QNEES VVLFLVAWTVT 
E1TRYSFYTFSLLDHLFYFI KWARYNFFIILYPVGVAGELLTIY 
AALPHVXXTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRR KVLHG \G *L* KRMI K* SLQTRCFFQNNQDYLS PS F 
NNKNKOLCEISWI VWFLK3 


5880 


1138 


1324 


SLWCLVAGGLGLG PSSON PLQRAGILARPREARGTFSALTACSA 
SVTSKGXSSSGMVJPSAASDRDSPVPLRPPGPVQLPSGTGWVLSD 
*KXKRGRCSS/WLSOPCHEREKEWLLRRSMAEGERARAASDVL 
PPQT,7iNPTWni»T?PTT.T7ATZVV?iyiPnHT JlKCLDERfiHAORNVGERSP 
DQSEHTDGHTSVQSVIEKLOEEI^LLKOKVTHVEDLNAKWORYN 
ASREEYVRGLHAQLRGIiQlPHEPFXMRKEISRLNRQLEEKINDC 
AEVKOELAASRTARDAALER V0MLEQ0I LAYKDDFMS ERADRER 
AOS R 3 QELE EKVAS LLHQVS WRQDS REPDAGRI HAG S KTAK Y LA 
ADALELMVPGGKR PGTGSOOP EPPAEGGHPGAAQRGQGDLQCPH 
CLOCFSDEOGEELLRHVAECCQ 


5881 


26 


443 


GG I HPSPTEAPRAOHLTMDCTWR I LFLVAAATGTHAQVQLLQSG 
SEVKKPGASVMVSCYVSGYTLTKLSMm'TVRQAPGKGLE^MGPFD 
LQDVETIYPOKFOGRVSMTEETSTETTQ/AYLELSSLRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGC\^KLYSHSLEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLOLGELDITSDEFILDEVDG\VDLR 
HYSKQVELEL00IEQKSIRDYIQESENIASLHNQ2TACDAVLER 
ME QMLGA FQ S DLS S I S S E 1 RTLQEQSGAMNI RLRN RQAVRG KLG 
ELVD3LW PS ALVTA I LEAP VTEPRFLEQLQELDAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKPMTiNrYQ 
IP0TALLKYRFFY0FLLGNERATAKE1RDEYVETLSKIYLSYYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 
TLGTRGSV I SPTELEAPILVPHTAQRGEQRYPFEALFRSQHYAL 
LDNS CRE Y LFI CEF FWSGPAAHDLFHAVMGRTLSMTLKHLDS Y 
LADCYDA3 AVFLCI H I VLRFRNI AAKRDVPALDRYWEQVLALLW 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PRFELILEMNVQSVRSTDPQRLGGLDTRPHYITRRYAEF5SALV 
SIKQT1PNERTM0LLGOLQVEVEKFVLRVAAEFSSRKEQLVFL2 
NN YDMMLGVLM \ E+ ERAADDS KEVES FQQLLNARTC EF I EELLS 
PPFGGLVAFVKEAEALIERGQAERLRGEEARVT0L1RGFGSSWK 
SSVESLS0DVMRSFTNFRNGTSII0GALT0LI0\LVHRFHRV\L 
SOPQLRALPARAELINIHHLMVELKKHKPNF 


5883 


2 


1374 


EFPC-RR FRAVKEAGAG AGAGAAGWS CPGPG PTVTTLG S YEAS EG 
CERKKGORV5GSbERRGM0AMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVOKGGSVGSLSVNKHRGLSLTETELEELRACVLQLVAEL 
EETRELAGOHEDDSLELOGLLEDERLASAQQAEVFTKQIQQLQG 
ELRSL.PEE7 C.LLEHE/CESELKFI FOELHIiAOAF TOM .POAAFn*; 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGSLGLSDYSGLQEELOELRERYHFLNEEYRALQESNS 
SLTG0LADL5SERT0R-ATERWLQSQTLSMTSAE SOTS EMDFLE P 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELOKHRQVSEE 
EORRU)RELKCAONEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5884 


4261 


252i 


GVLARAS ARLRV PLTGVRACAEPEVGAE PAKVAGAAE PDEBGGR 
SRLRDCGDYTPSERLGPKGAMLWFOGAIPAAIATAKRSGAVFVV 
FVAGDDEOSTOMAASWEDDKVTEASSNSFVAIKIDTKSEACLOF 
SQIYPWCVPSSFFIGDSGIPLEVIAGSVSADELVTRIHKVRQM 
HLLKSETS VANG SQ S ES S VST PS AS FE PNNTC ENS 0 S RNAEL CE 
IPSTSDTKSDTA'J'GGESAGHATSSQEPSGCSDQRPAEDLNIRVE 
RLTKKT .EERREEKRKEEEQREIKKEI ERRKTGKEMLDYKRKQEE 

EVEAAKAAAbLAKQAEM EVKRES YARERSTVAR 10FRL PDGSSF 
TNQFPSDAPLEE AROFAAQTVGNTYGNFS LATMFPR RE FTKEDY 
KKKLLDLEIiAPSASWLLP/AI»FINF*AGRPTASIVKSSSGD2W 
TIiTifiTVT^YPFT .AT WRI»T ^NRT.F^NPPPTOT^VRX/T^FPPNPA^ 
SSKSEKREP VR KRVLE KRGDDF KKEG K I YRLRTQDDG E DENNTW 
NGNSTQQtf 


5885 


900 


467 


AAGGGR R S R L SR S WPTGP S KS PSG VR CCG \ R R \ AWED KD EFLDV ; 
IYWFROIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 1 
YLO I DE EEYGGTWELTKEGFMTS FA/ 1 VHGHLDHLLH CH PL* LM 1 
VYSSQVLPIQSKGPS


5886 


fife 


1341 


PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSCPPSPGS
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG
EVLLEALFLTVDPY^VAAK^LKEGDTMMGQQVAKVVESKNVAIi 
PKGTI VIiAS PGWTTHS I SDGKDLEKLLTEWPDT I PLSLALGTVG 
MPGLTAYFGLLE I CGVKGGETVMVNAAAGAVGS VVGQI AKLKGC 
KWGAVGSDEKVAYL0KU3FDWFNYKTVESLEETLKKASPDGY 
DCYFDNVGGEFSNTVIGQMKKFGR1AICGA1STYNRTGPLPPGP 
P PEI G I YOELRMEAF WYRWQGDARQKALKDLLKWV1.ELPYFV I 
D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKT1VKA 


5887 


1937 


104 


APGCRGCRATRCPCRGPRWDSI/3DEAARSPAAPGGAPGLLGLRE 
RPDRCKPGGDDRGPQLHRGSPG/SPSELSRRPGPPGLPGLQGPP 
PAPGLPQSRTL/ PVLCVCDLS PAQCD1 NCCCDFDCS S VDFSVFS 
ACSVP WTG DSO FCS QKAV I Y S LNFTAN PPQRVFELV DQ I NP S I 
FCIHITN\*NLHYPLLIOKYL/NENNFDTLKKTSDGFTLNAESY 
VSFTTKLDIPrAAKYEYGVPLQTSDSFLRFPSSLTSSLCTDNNP 
AAFLVNQA V KCTR K I NLEQCEE I EALSMAFYSS PE I LRV PDSRK 
KVPITVQS I VI QSLN KTLTRREDTDVLQPTLVNAGHFS L CVNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSSVWPLQQKFEIHFLQ 
ENTQPVPLSGNPGYWGLPLAAGFQPHKGSGIIOTTNRYGQLTI 
LHSTrEODCuALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVAQ 
KVKS LLWGOGFPDYVAPFGNSQGP /ADMLDWVP IHFITQS FNRK 
DS COLPGALVI EVKVITKYGSLLNPQAKI VNVTANL3 SSSFPEAN 
SGNERTI L 1 S TAVTFVDVS APAEAG FRAPPAINARLF FNFFFPF 
V 


5888


375 


2302 


LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG 
LELKPDYKTWGPECVCSFLRRGGFEEPVLLKNIRENEITGALLP 
CLDESRFENLGVSSLGERKKLLSYIQRLVQIHVDTKKVINDPIH 
GK I ELHP LLVR 1 1 DTP Q FQRLR Y I KQ LGGG Y Y V F PG AS KNR FEH 
SLG VG YLAG CLVHALG EKQ PELQ I S ERDVLC VQ I AGLC HDLGHG 
PFSHMFDGRFIPIARPEVKWTKEQGSVMMFEHLINSNGIKPVME 
QYGL1PEEDICFIKEQIVGPLESPVEDSLWPYKGRFENKSFLYE 
I VSNKRNG3 DVDKWDYFARDCHHLGIQNNFDYKR F I KFARVCEV 
DNELRI CARDKEVGNLYDMFHTRNSLHRRAYQHKVGNI IDTMIT 
DAFLKADDYIEITGAGGKKYRISTAIDDMEAYTKLTDNIFLEIL 
YS TDP KLKDARE I LKQ 1 E YRNLFKYVGETQPTGO 1 XI KR ED YES 
LPKEVAS AKP KVLLDVKLKAEDF I VDVINMDYGMOEKNP I DHVS 
FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS 
LYA\AROYFVOW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\OLFKDDPM 


5889 


183J 


731 


LPAACGRPVTARPRCAPEGRSGRPRDL3PYPPQVFPPRPDRVAI 
VTGGTDG1GYSTAKHLARLGMHVI I AGNNDSKAKQWSKI KEET 
LNDKET*VLLCCPGWLCLWN5SDPP?SASRGAGTTGVHHHFLLK 
FG 1 FI L\ DIASMTS I R QF VOKFKMKKI PLHVL1 NNAGVMMV POR 
KTRDGFEEKFGLNYU5HFLLTNLI>LDTIiKESGSPGHSARWTVS 
SATHY VAE LNMDDL0 S S ACY S PHAAYAQS KLALVLFTYKLQRLL 
AAEGS^TAJ^JVVDPGVVNTDLYKHVFWATRLAKICLLGWLLFKTP 
DEGAWTS I YAAVTPELEG VGGRYLYNKKETKSLKVTYNQKLQQQ 
LWS KS CEMTGVLDVTL 


5890 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLOSGTEAACRS 
GR P D PR P AS AAGGHAG ERMS QRDTLVHL FAGGCGGTVGA I LT CP 
LEWKTRLQSSS VTLY I S EVQLNTMAGASVNRWSPGPLHCLXV 
I LE KEGPR S LFRGLG FNLVGVAPSRAI YTAAYSN CKEKLNDV FD 
PDSTQVHM1SAAMAGFTAITATNPIWLIKTRLQL* / SOGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFV1YES1 
KQKLL E Y KTASTMENDE ES VKE ASD F VGMMLAAATS K\ LVATT I 
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQJ P\NTAIMMATYELWYLLNG 


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLOSGTEAACRS 
GRPP PRP AS AAGGHAG ERMS QRDTLVHLFAG GCGGT VGA I T ,TCP 
LEW KT RLOS S S VTLY I S E VQLNTMAGAS VN RWS PGPLHC L KV 
I LEKEG PR S LFRGLG PNLVGVAPSRAI YFAAYSNCKEKLNDVFD 
PDSTOVHMI SAAMAGFTAITATNPIKLIKTRLQL* / SOGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFV1YESI 
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI 
AYPKEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQI P \ NTA I MMAT Y E LWYLLNG 


5892 


1764 


379 


wlrvcgrls vnsavs s rtggws agltcamorlowlghlrgpa 
dsgwkpoaapclsgap:^saadvvvvhgrrtaicragrggfkdt 
tpdel-lsavmtavlkdvktlrpeqlgdicvgnvlqpgagaimari 
aqflsd i petvplstvnrqcssglqavas i aggi rngs ydigma 
cgvesms ladrgnpgn i tsrlmekekardcl i pmg 1 tsenvaer 

FGISREKODTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG 
TKRS ITVTQDEG IR PSTTMEGLAKLKPAFKKDGSTTAGNS SQVS 
DGAAAILLARRSKAEELGLP1LGVLRSYAWGVPPDIMGIGPAY 
AI PVALOKAGLTVSDVDI FEINE\AFASQAAYCVEKLRLPP* EG 
+ TPLGGASGP * GHPLGLHWGKVQVITLAQ * S * SARGKRAYRSGC 
PCA1GSWNGSPLPVFEYPWGT 


5893 


3 


1653 


ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVEEGLEPT 
CFERTEDIGGVWRFKENVEDGRASIYQSWTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKi i.QFQl i V0j2>vkklf 
DFSSSGOWKWTQSNGKEQS AVFDAVMVCSGHH I LPHI PLKS FP 
GMERFKGOYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
KAA0VFI STRHGTWVMSR ISEDGYPWDSVFHTRFRSKLRNVLPR 
TAVKWM I EQQMNRWFNK ENYGLEPQNKYI MKEPVLNDDVP SRLL 
CGAI XVKSTVXELTETSAIFEDGTVEENIDV1 1 FATGYSFSFPF 








LEDSLVKVENNMVSLYKYIFPAHLDKSTLACIGL10PLGSIFPT 
AELQARWVTRVFKGLCSLPSERTMMMDIIKRNEKRIDLFGESQS 
OTLQTNyVDYLDELALEIGAKPDFCSLLFKDPFCLAVRLYFGPCN 
S Y * YRLVG PGQWEGARNA I FTQKQR I LKP L KTRALKDS SNFS VS 
FLLKI LGLLAWVAFF\ CQLQWS 


5894 


174 


1673 


RYS ? KKVLQN KESSLKLGMATALVSAHS LA F LN LK KEGLR WRE 
DH Y S TWEOG FXLCGNS KGLGQ EP L CKQFR C Y E E TTG P REALS 
RLRELCGQWLQPETHTKEHILELLVLEQFLl 1 1 j P KELQARVQEH 
KPESREDVVVVLEDLQLDLGETGQQVDPDCPXKQK1LVEEMAPL 
KG VQ EQQ VRHECEVTKPEKE KG E E TR I ENG K L I WTDS CGRVES 
SGK3 SEPMEAHNTFXSSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LI EHASTHTGKKLCESDVCQSSSLTGHKKVLS * ERKVIQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSONAGLLEHLR 
1HTGEKPYLCIHCGKNFRRSSKLNRHQR1HS0EEPCECKECGKT 
FSOALLLTHHQRIHSHSKSHOCNECGKAFSLTSDLIRHHRIHTG 
EKPFKCN I CQKAFRLKSHLAOHVR1 HNEEKPYOCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895


2967 


86 


HPS L LG AI PFYPPPSSPWPPPLYL FWNSKR K S R H F I NQRG I HGE 
KRLFVSDG\TGCLFVLAAAGRARGRAEVL3S?VGPEDCWPFLT 

r pk v p vlq ldsgnylfsts ai cry ff\ lls g w e 0 ddltnq wle w 
eatelqptlsaalyyl\wqgkkg\edvlg5vrrtlthidhsls 
roNncpflagetesladivlwgalypllodpaylpeelsalhsw 
fqtlstq\epcqr\aarrlvlkq\ogvlalr \ pylqkqpopspa 
egkg ls p i epeeeelatlseee i amavtaw e kg leslp plrpqq 
npvlpvagernvlitsalp yvknvphlgn 1 3 gcvlsadvfarys 
rlrcwntlylcgtdeygtatetkal\eegltpqeicdkyhiiha 
diy\rwfnisfdifgrtttpqq\tkit\odifqollkrgfvlqd 
tveq lr ceh carf\ladrf veg vc p fcg ye e ar gdqcd kcg kli 
navelkkpqckvcrscpwqssohlfldlpklekrleewlgrtl 
pgsdwtpnaqfitpffgfrewpskprwq*trdlk\wgnpgtp*e 
gfedk\ vfyvwfdatigylsitan ytdqweh ww \ knpeqvdlyq 
fm \ a kdxvpfhs lvfpssalgaedn ytl \vs hl 1 ateyln yedg 
k\fsksrgvgvfrdm\ahdtgippuisrfyl\lyirpegk\dsa 

F^WTDT.T.T.yNMQV FT.T.NNI^NFTNRAXGMFV^KFFGGV YVPEMV 

ltp dd0rlla\ hvtlelqiiyi iq\ ll ekvr 1 h d alrs 1 lt i s \ rh 
gnqy 1 \qvnepw\ kr ikgseadrqragtvtglavni aallsvml 
qpymp tvs at i qaqlqlp p pacs 1 lltnflctl paghqi gtvs p 
lfoklendqieslrqrfgggqaktspkpawetvttakpoqiqa 
lmde vtkqgni vrelkaqkadkne vaasvak lldlkkolavaeg 
kppeapkgkkkjc 


5896 


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MR L FV S DG VPGCLP VLAAAGRAR GRAE VLI S T VG PEDCWP FLT 
RPKVFVLQLDSGNYLFSTSAICRYFF\LLSGKEQDDLTNQWLEW 
EATELQ PTLS AALYYL\ VVQG KKG \EDVLGSVRRTLTH I DHSLS 
RO\NCPFLAGETESLADIVLWGALYPLLQDFAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\OGVLALR\PYLQKQPQPSPA 
E3KGLS PI EPEEEELATLSEEEI AMA VTAWE KG LES LPPLR PQQ 
NPVLFVAGERtTVLITSALPYVNI^PHLGNI 3 GCVLSADVFARYS 
RLROWNTLYLCGTDEYGTATETKAL\EEGLTPOEI CDKYHI IHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFOOLLKRGFVLQD 
TVEQLR CEHCAR F\ LADRFVEG VCP FCG Y EEARGDQCDKCGKL I 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRL5EWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRELK\WGNPGTP*E 
GFEDK \ VFYVW FDATIGYLS I TANYTDQWER WW \ KNPEQVDLYQ 
FM\AKDNVPFKSLVFPSSALGAEDNYTL\VSHLIATEyLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS \ELLNNLGNF INRA\GMFVS K FFGG\ YVPEMV 
LTPDDORLLA\HVTLELQKYHQ\ LLEKVRI RDALRS I LTI S \RH 
GNQY I \QVNEPW\ KR I KGS EADRQRAGTVTGLAVN I AALLSVML 
QP YM P TVSATI QAQLQLPP PACS I L LTN FLCTL PAG HQI GTVS P 
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA 








LMDEVTKOGNIVRELKAQKADKNEVAAEVAKLbDLKKQLAVAEG 
KPPEAPKGXKKK 


5897 


2967 


86 


HP SLLGA I PFYP PPSS PWPPPLYLFWNSHRKSRH FINQR6I KGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFliT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATEbCPTLSAALYYIiXWQGKKGN EDVLGS VRRTLTHI DHShS 
RQ \NCP FLAG ETE S LAD I VLWGAL Y PLLQD P AYLPEE LS ALHS W 
FQTLSTQ \EPCQR \ AARR LVLKQ \ QGVLALR \ PYLQKQPQP £ PA 
EGXGLS PIE PEEEELATLSEEE I AMAVTAWEXGLESLPPLRPQQ 
NPVLPVAGERNVLI TSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEI CDKYHI II IA 
DI Y \ RWFN I S FDIFGRTTTPQQ\TKI T \QD I FQQLLXRG FVLQD 
TVEOLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVSLXKPOCKVCRSCPVV0SSQHLFLDLPKLEKRLEEWLGRTL 
PGS DWT PNAQ FI TP FFGFRE W PS K P RWQ * TRDLK \ WGN PGT P * E 
GFEDX\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\AXDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FS WTDLLL KNNS \ ELLNNLGNFINRA\GMFVS KFFGG \ YVPEMV 
LTPDDCRLLA\HVTLEXiQHYHQ\LLEKVRIRDALRSH,TIS\RH 
GNQYI \QVNEPW\KRI KGSEADRQRAGTVTGLAVNI AALLSVML 
QP YMPTVS ATIQAQLQLPPP ACS ILLTNFLCTLPAGHQI GTVS P 
LFQKLENDQIESLRQRrGGGQAXTSPKPAVVETVTTAKPQQIQA 
LMDEVTKOGNIVRELKAQKADKNEVAAEVAKT.T..DLKKQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRXSRHFINQRGIHGE 
MRLFVSDG VPGCLPVLAAAGRARGRAEVLI STVGPEDCWPFIiT 
RP KVP VLQLDSGN YLFSTSA 1 CRY FF\ LLS GWEQDDLTNQWLEW 
EATELQPTLSAALYYL \ WQGKKG \EDVLGS VRRTLTHI DHSLS 
RO\NCPFI^AGETES LADI VLWGALY PLLQD PAYLPEELS ALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 

egkglspjepeeeelatlseeeiamavtawekgleslpplrpqq 
npvlpvagernvli tsalpyvnnvphlgni igcvlsadvfarys 
rlrowntlylcgtdeygtatetkal\eegltpqeicdkyhiiha 
diyVrwfnisfdifgrtttpqqXtkitNqdifqollkrgfvlqd 
tveqlrcehcarf\ladrfvegvcpfcgyeeargdqcdxcgkli 
navelkxpqcxvcrscpwqssqhlfldlpxlexrlbewlgrtl 
pgsdwtpnaqfitpffgfre1\'pskprwq*trdlk\wgnpgtp*e 
gfedk\vfyvwfdatigylsitanytdowerww\kkpeqvdlyq 
fm\akdnvpfhslvfpssalgaednytl\vshliateylkyedg 
k\fsksrgvgvfrdm\ahdtgippdisrfyl\lyirpegk\dsa 
fs wtdlll knns \ellnnlgn finra\gm fvs kffgg \ yvpemv 
ltpddqrlla\hvtlelqhyhq\llexvr I RDALRS iltis \rh 
gnqyi \qvnepw\krixgseadrqragtvtg1jwniaallsvkl 
qpymptvsati qaqlqlpppacs illtnflctlpageqigtvsp 
lfqklendqieslrqrfgggqaktspkpawetvttakpqoiqa 
lmdevtkognivrelxaqxadknevaaevaklldlkkqlavaeg 
xppeapxgkxkk 


5899


326 


1078 


ntpxskepngvpj^slpsplraamalsdvdvxkqikkmmafieq 
eanexaeeidakaeeefniekgrlvqtorlkimeyyekkekqie 
qqkki lmstmrnqarlkvlrarotli sdllseaklrlsri vedp 
evyoglldklvlqgllrllepvmivrcrp\odlllveaavokai 
peymtisokhvev\qidkea*lavecswewjevysgnqrixvsn 
tlesrldlsakqkmpeirmalfgant>jrkff: 


5900 


64 


1409 


kaasrdspclefcplcgvsshdlqhrmwyhrlshlhsrlqdllk 
ggv i y pal pq pn fksllplavhwhht as ks ltcawqohedh fel 
xyantvmrfdyvwlrjdhcrsascynskthorsldtasvdlcikp 
ktirldettlfftwpdghvtkydlnwlvknsyegqkqkvjqpri 
lwnaeiyooaqvpsvdcqsfletneglxkflonfllygiafven 
vpptoehtexlaerisliretiygrmwyftsdfsrgdtaytkla 
ldrhtdttyfqepcg1qvfhclkhegtggrtllvdgfyaaeovl 








OKAFEEFELLSKSAI\KHEYIEDVGECHQFHDWDWAQS*ISTHG 
/YKELYLIRyNNYDRAVINTVPYDWHRWyTAHRTLTIELRRPE 
NEFWVKLKPGRVLFIDKTVIRVLHGRECFTGYROLCGC^LTRDDVL 
NTARLLGLQA 


5901 




212; 


VAI EQTSLKMiy OAVGGAPARPTG EY I CNQCGAKYTSLDSFQTHL* 
KTHLDTVL? KLTCPQCNKE FPNQESLLKHVTI HFMITST YY I CE 
SCDKQFTSVDDLOKHLLDMHTFVFFRCTLCQEVFDSKVSIQLHL 
\AVKHSNEKKVYKCTSCKnCDFPJaETDLQLHVKHNHLENOGKVHK 
CI FCGES FGTEVE LQCHI TTHS KK YNCKFCS KA FHAI ILLEKHL 
R E KHCV FE T KTPN CGTNG A S E QVQ KEE V E LQTLLTNS QESHNSH 
DGSEEDVDTSEPMYGCDICGAAYTMETLLQNHQLRDHNIRPGES 
AIVEQCKAELIKGNYKCNVCSRTFFSENGLREH^THLGPVKKYM 
CPICGERFPSLLTLTEHKVTHSKSLDTGNCRICKMPLQSEEEFL 
EHCQMKPDLRNSLTGFRCWCMQTVTSTLELKIHGTFHMQKTGN 

CAGCVKLSKSASPGI NVPPGTNRPGLGQNENLSA I EGKGKVGGL 
KTRCS*LATFKF'*\TjKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
OVSPMPRISPSQSDEKKTYOCIKCQMVFYNEWDIOVHVANHKID 

FVQANKLOQH I FS AHGQEDKI YDCTOCPQKFFFQTELQNHTMTQ 

HSS


5902 


712 


20S 


LKNRRRSRPS I ROS I GSTS VSRWLTS LFTY LDHTADVQ * V* REF 
1 PLXPRQ* ED * MFQSWLHAWGDTLEEAFEQCAMAKFGYMTDTGT 
VE PLQTVEVETQGDDLQSLLFH FLDEWL YKFSADE FFI P \ GWGE 
EFSLSKKPQGTEVKAITYSAMOVYNEENPEVFVI I DI 


5903 


2106 


73 £ 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLOGRGLPTT 
PALFALSAVPGGAASPMPPSGLRLbPLLLPLLWLLVIiTPGRPAA 
G LS TCKT I DM ELV KR KR I E A I RGQI LS KLRLAS PPSQG E VPPGP 
LPEAVLALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVET 
HNEI YDKFKQSTHSI YMFFNTSELR^AVPtPVJjljSKRbijKljijKl» 
KLKVEQHVEL YO K Y S NN S WR YLS NRLLAPS DS P E WLS FD VTG W 
ROWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
AT I HGMNR P FLLLMAT PLE RAQHLQS \ S RHRQAL \ DTN Y\ CFSF 
KGGRNCLRC/ VHC* HL1 FR KDL\GW\ KWI \HE \ P KGYHANFC\L 
G PCP YI WSLDTQYSKVLALYNQ\HKPG\ASAAP \ CCVPQALEP\ 
LPIVYY\VGRK?KVEQLSNMIVRSCKCS 


5904


3 


1126 


MMEEIENAIWTrKEEQRLIYEELIKEEKTTNWELSAISRKIDTW 
ALGNSETE KA FRA 1 S S KVP VDKVTPS TLP EE VLD FEKFLQQTGG 
x?r\H'h wrinvnuoMirwvtrDM vuvr v DTPMPi?\n .put .Df^KTonPUriTl 

KybAnUUX DnyWr Viv VKH JW\\3 Air 1 f FlCtCi V Jj£»tXLirV9ll J. \lLfCt VW 

HE KWYQKFLAL.EERKKES I QI WKTKKQQKREEI FKLKEKADNTP 
VLFHNKQEDNQKOK E EQRKK0KLAVE AWKKQKS I EMSMKCASQL 
KEEEEKEKKHOKER0RQFKLKXLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADEISRF0ERDLHKLELKILDROAKEDEKSQ 
KQPJILAKLKEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNEKEIVRLRTIGELLAPAAPFDKKCGREXWTVAFAP 
DGSYFAWSQGHRTVKLVPWSQCLONFLLHGTKNVTNSSSLRL?R 
ONSDGGQKNKPREHIIDCGDIVWELAFGSSVPEKQSRCVNIEWH 
R FR FGQDQLL LATGLNSGR I KI WDVYTGKXLLNLVDHTGWRDL 
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNVrVY\ 
S CA FS PDSS MLCS VGAS KA WAA I LV * LR LCKHHSHT3 ATM VLS 
WAERVASLATGLGATFTIG*SNLAFVLQGVLYVHRCWSMSTFCF 
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI 
WLSNGFSVLFFGI LSDSRDI LRL* FNLKFVLI FF * K* CIVSVQK 
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRILFRAILHS* 
LLIFRI*NCI*TYS*IIDPFYIOMTYDRG*FGKNKMVKF*FIEM 
*LYYFHKIAFSFCNVV*KPCCLPKKFHLAVNII*FACSICFSS*A 
QVGD PS Uj* TSD Y L KGRCQW SNNLLTLR FLS VY F FKNLWSGKK 
REGGL* YliTLFl S V YFS * LVFGINGFOYS FWKLHCLYFMFRLI 
FKLT FN RN1 * NR I CMSAL I NL KTDFNLTMTLS IFF KLLI IYNA* 
YNLN* I * QF* YKMCHFVX>CMSE* SYNI CLFI AGF\LWNMDKYTM 








I RKLEGHHKD VVACDFSPDGAXLATAS Y DTRVY I WDPHNGDI LM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGIjHVASLADDKMVR 
FKR I DEDY PVQVAPLSNGLCCAFSTDGSVLAAGTKDGS VYFWAT 
PROVPSLQHLCRMS I RRVMPTQEVOELP 3 PS 3CLLEFLS YRI 


5906


146 


2036 


REGAGSGRMASGA\YNPYIEI 1EQPRQRGMRFRYKCEGRSAGSI 
PG EH STDNNRTY PS I Q I MNYY G KGKV\R 3 TLVTK \NDP Y KPH PH 
DLVGKDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE 
A\ I ITR\ I KAGINPFDVP* KQLNDI EDCDLDWRLWFRVFLPDG 
HGKI*\TTALP?V\VSSPiyDNRAPNTAELRVCRVNKNCGSVRGG 
DE 1 FLLCD KVQKDD I E VR F VLN DW E AKG 1 FSQADVHRQVA1VFK 
TP PYCKAI TEPVTVKMOLRRPSD0EVSESMDFR YLPDE KDTYGN 
KAKKQKTTLLFQKLCCDHVETGFRHVDQDGLELLTSGDPPTLAS 
QS AG I TVN F PER PRPGLLGS I GEGR Y FKKE PNL FSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSKHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSSFSTRTLPSNSQG 3 PPFLRI PVGNDLNASNAC I YNN 
ADDIVGMEASSMPSADLYG1SDPNMLSNCSVNMMTTSSDSMGET 
DNPRLbSMNLENPSCNSVIjDPRDLROLHQMSSSSMSAGANSNTr 
VFVSQSDAFEGSDFSCADNSM3NESGPSNSTNPNSHVFVQUSQY 
SG3GSMQNEQLSDSFPYEFFQV 


5907 


95 


1873 


TYLLSSWSS* *NLDTKIKSQVKV/RKGHKK2SWPYPQPAKQNGK 
KATS K V PS A PH FVHPNDHANREA ELK K KWVE EMRE KQQAAREQE 
R0KRRTIESYCQDVLKR0EEFEKKEEVLOELNMFPQLDDEATRK 
AYYKEFRKVVEYSDV3LEVLDARDPLGCRCFQMEEAVLRAQGNK 
KLVLVLNKIDLVPKEWEKWIjDYLRNELPTVAFKASTOHOVKNIj 
NRCSVPVDQASESLLKSKACFGAENLMRVIjGNYCRIjGEVRTHIR 
VG WGL PNVG KS S LI N S LKR S RACS VG AVPG3 T K FMQE VY LD KF 
IKI.bDAPGIVPGPNSEVGTILRNCVHVQKliADPVTPVETILQRC 
NLE EI SN YYGVS GFQTTEH FLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWSGKISFYIPPPATHTLPTHLSAEIVKEMTEVFDIEDT 
ECANEDTMECLiATGESDELIjGDTDPLEMEIKLLHSPMTKIADAI 
ENKTTVYK3GDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
S VDRRSVLQR 1 METDPLQQGQAliAS ALKNKKKMQ KRADK I AS KL 
SDS MMSALDLS GNADDG VGD 


5908 


247 


975 


HCG 3 KKRG EGSGS PS PASGGFQLGCQ I PEPSLPSEEETHPHTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPLDGDGGIaASWK/PMRER* 

gwrrpakaagaslgvaatgkrgcrmskrylqkatkgkll3 3 ifi 
vtlwgkwssanhhkahwktgtcevvalhrccnknkieersqt 
vkcs cfpgqvagttraapscvdas3 veqkwwchmqpclegeeck 
vlfdrkgwscssgnkvkttrvth 


5909 


1 


5002 


PA3 PGSTI 1WAPGSHSAARAJX5RHGSLPS0SQAPGALCGARAPP 
SSNLRADRSM3 CAQARAGKNLY3^FLGLAAMAFPSRNSQS LRR 
CKEP3 R YS YNPDQFHNMDLRGG PHDGVTI PRSTSDTDLVTS DSR 
S TLMGRS S Y Y S 3 GHS QDLV I H WD 3 KE E VDAGDW 3 GM YL I DE VLS 
ENFLDYKNRGVNG SHRGQ1 3 WKI DAS S YFVEPETKI CFKYYHGV 
SGALRATTPSVTVKNSAAPI FKSIGADETVQGOGSRRliI SFSLS 
DFQAMGLKKGMFFNPDPYLKIS 3QPGKHS IFPALPHHGQERRSK 
I IGNTVNP1WQAEQFSFVSLPTDVLE1 EVKDKFAKSRPI IKRFL 
GXLSKPVQRLLERHAIGDRWSYTLGRRljPTDHVSGQLQFRFEI 
TSS3HPDDESISLSTEFESAQI0DSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQS 3 ELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEEIiAEQLDLGEEASALLLE 
DGETiPASTKEEPLEEEATTOSRAGREEEEKEQEEEGDVSTLEQG 
EGRLQLRASVKRKSRPCSLPVSELETV3ASACGDPETPRTHYIR 
1 HTLLHS M PS AQGGS AAEEEDGAE EEST LKDS S EKDGLS E VDTV 
AAD P S ALE EDR EE P EG ATPGTAH PGHSGGH FPS LANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCBGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDE1AAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
I DE PLPFNVIE ARI DSHGRVFYVDHVNRTTTWQRPTAAATPDGKR 
RSGS 1QQMEQLNRRYQN3 QRTIATERSEEDSGS0SCE0APAGGG 








GGGGSDSEABSSQSSLDLRREGSLSPVNSQKITLLLQSPAVKFl 
TNPEFFTVLHANySAyRVFTSSTCLKHMILXVRRDARNFERyOH 
NRDLVNFINMFADTRLELPRGWEIKTDQOGKSFFVDHNSRATTF 
IDPRIPLQNGRLPNKLTHRQHLORLRSYSAGEASEVSRNRGASL 
LAR PGHSLVAAI RSQHQHESLPLAYNDKI VAFLRQPNI FEMLQE 
ROPSLARNHTLREKI KY I RTEGNHGLE KLSCDADLVI LLSLFEE 
EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 
YRRDFEAKLRNFYRKLEAKGFGCGPGKIKL1 1 RRDHLLEGTFNO 
VMAYSRKELORfJKLYVTFVGEEGLDySGPSRBFFFLLSOELFNP ; 
YYGLFEYSANDTYTVOISPMSAFVENHLEWFRFSGRlLG\liAbI 1 
HO Y LLDAFFT\ R P F Y KALL \ R LPC \ D \ LSDLE YLDKE F HQS LQW 
MKDNNITDILDLTFTVKEEVFGQVTERELKSGGANTQVTEKNKK 
EYJ ERM VKWRVERGWCOTEALVRGFYEWDSRL VSVFDARELE 
IjVIAGTAEIDLNDWRNNTEYRGGYHDGHItVIRWFWAAVERFNNE | 
ORLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP*KKWGKITS 
LPPRG \HTCLQPD WDLPT VS PRTPMLYEK\LLTA\ VEETSTFGT 


5910 


1526 


446 


VAE FAAMEPGRTQ I K LDPRY TADLLE VIjKTN YG I PSACFSQPPT 
AAQL LRALGPVE L ALTS I LTLLALGS 1 AI FLEDAVYLYKNTLCP 
I KRRTLLWKSSAPTWSVLCCFGLWI PRS LVLVE M T I TS FYAVC 
FYLLMLVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTR K KLQ \ R * C W ALS NTPS * R * R * P WWACFS S P TAS MTQQTFL 
RGAC LYG S TLS S A/ C S TL-L ALWTLG IIS RQARLHLG EQNMG AKF 
ALFQVLLILTALQPSI FS VLANGGQ 1 ACS PPYSSKTR3QVMNCH 
LLI LETFLMT VLTRK Y YRRKDII KVGYETFS SPDLDLNLKALRWM 
AWTMKGCCTH 


5911 


109 


595 


QLPLAPC I QGKGIiEMRS PKPQS FI IRS SHSGAGLLV KNPSTPVF 
CGHRRGGAAFKYKPTPWGPEORPTGOKHMRGGVSLLSPRLECS 
GTI SAHCNLRLPSSSNSPAPAS * LAG I TGVCHHAQLl FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


MILNK^MLGALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSKEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPOFALTN 
IAVLKHNLNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLSNGHSVTEGVSETRPSSPKSDHFI.TiODQ 
VTS PSFPFE** DL* TAXVEQLGAWFEPLLKHWGAEI PTTL 


5913 


46 


1196 


QLRMAGAEGAAGROSELEPWSLVDVLEEDEELENEACAVLGGS 
DSEKCSYSQGSVKROALYACSTCTPEGEEPAGICLACSYECHGS 
HKL FELYTKRN FR CDCGNS KFKNLECKLLPDKAKVNSGNKYKDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFOEKVCQACMKRCSFLWAYAAOLiAVTKlST\GNMDWCGTLM 
E*/DDQEVIKPENGEHODSTLKEDVPEQGKDDVREVKVEQNSEP 
CAGS S S E SDLOTVFKNE S LNAE SKSG CKLQELKAKQL I KKDTAT 
YWPLNWRSKLCTCQDCMKMYGDLDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMIORVOOVELI C/G IQ* FED 


5914 


960 


124 


NLGGS ELP PEEALF 3 QVASMNQRRVDFYLAS IEDMLVAI /GGRN 
ENGALSSVETYS PKTDS WSYVAGLPRFTYGHAGTI YKDFVYI SG 
GKDYQIGPYRKNL.bCYDHRTDWIEERRPMTTARGWHSMCSLGDS 
I YS I GGSDDN I ESMER FDVLGVEAYS PQCNQWTRVAPLLHANSE 
SGVAVWEGRI Y I LGGY S WENTAFS KTVQ V YDRE ADKWSRGVDLP 
KAT AGGSACFI AP* SLGORTRKRKAXARGTRTGASDPS CASMDH 
PHRH LPGLCRP AATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHCHSPRPRTCPPGALOAPEA 
PASRAEGPVAVWtfGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLl.PAP/PGLPS 
PRELPGEEPSAHPVHOGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGI** * AAGPAAH 


5916


256


633 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSrKGTSGKGCR 
TVTG AVHRHLNHVAG 1 1 PWVLKSQLKPTAATAQDQWTSQQYPDH 
PTRL 1 1»0* NQATADKNTJ * TTALLQPHQRL \VSPRr4AEA 


5917 


1343 


827 


AHQ1 LTYLEP/ ICLVVNYNKIbTV FLTKSVLE1 * KF1HTPQT YR 








?*NDFFGIKEVyvSRRLRKTSF/RLAVTFLEOAWSKECVPVDQ 
FMEHLLPS LLSLAS D PVPNVRVLLAKALRQMLLEKAY FRNAGNP 
HLEVIEETI LALOSDRDQDVSFFAALEPKRRNI IDTAVLEKQN 


5918 


13 



1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETPFYGDEALSGLGC-GASGSGGTFASPGRLFPG 
A?PTAAAGSMMKKDALTLSLSEOVAAALKPAPAPASYPPA\ADG 
APSAAPPDGLLASPDLGLLKLASPELERLIIQSNGLVTTTPTSS 
QFLYPK\^AASEEOEFAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGS AP PG ELAPAAAAPEAP V YA\NLS S Y \ AGGCRGL 
RGGAATWAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS*SPEHGSLASTASLLREQVAQLX 
QKVLSHVNSGCQLLPQHQVPAY 


5919


1


4254 


TSV0GDSOGTPTSSOGS INMEHW I SQA3HGSTTSTTSSSSTQSG 
GSGAAHRLADVMAQTHIENHSAPPDVTTYTSEHSIQVERPOGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVQQPDPNQPXPEGAQMLAMRGEOLGWTNW 
PPSLE AALQRWGT3 S P KAPCLTTMDTNGK PLY I LTYG K LWTRSM 
KVAYSILHKLGTKOEPMVRPGDRVAIiVFPNNDPAAFMAAFYGCL 
LAEWPVP 1 E VPLTRKDAGSQQIG FLLC-S CGVTVALTS DACHKG 
LPKS PTCEI PQFKGW P KLLWFVTES KHLS KP PRDWF\ PK I KDAN 
NDTAY3EYKTCK\DGSVLGVTVTRTAIXTHCQALTQACGYTEAE 
TI VNVLDFKKDVGLWHG I LTS VMNMMHV I S I PYSLMKVN PLS WI 
QKVCQYKAKVACVKSRDMHWALVAHRDQRDINLSSLRMLIVADG 
ANPWSISSCDAFLNVFQSKGLRQEVICPCASSPEALTVAIRRPT 
DDSNQPPGRGVLSMKGLTYGV I RVDSEEKLS VLTVQDVGLVMPG 
AI MCS VKPDG VPQLCRTDE I GELCV CAVATGTS Y YGLSG MTKNT 
FEVFAMTSSGAP I SEYPFI RTGLLGFVGPGGLVFWGKMDGLKV 
VSGRRHNADD3 VATALAVEPMKFVYRGRIAVFSVTVLHDER I VI 
VAEQRPDSTEEDS FCWMSRVLQAIDS IHQVGVYCLALVPANTLP 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 

feigpasvmvgnlvsgkriaqasgrdlgqiedndqarkflfl.se 

VLQWRAQTTPDH I LYTLLNCRGA I ANS LTCVQLHKRAEK I AVML 

mbrghlqdgdhvalvyppgidliaafygclyagcvpitvrpphp 
cn i attlptvkm i vevsrs aclmttqli ckllrsreaaaavdvr 
tkplildtdd*pkkrpaqickpcnpdtlayldfsvsttgmlagv 
kmshaatsafcrsiklqcelypsrevaicldpycglgfvlwclc 
svys gh q s i l 3 p ps ele tn palwllavs qykvrdt fcs y s vmel 
ctkglgsqteslkargldlsrvrtcvvvaeerprialtqsfskl 
fkdlglhpravstsfgcrvnlaiclqgtsgpdpttvyvdnralr 

HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 
LGEIV7VHSAKNASGYFTI YGDESLQSDHFNSRLSFGDTQTI WAR 
TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 
I ETS V I RAHKS VTECA VFTWTNLLWWELDG SEQKALDLV PLV 
I'NVVLEEH YLI VG VVVVVDIGVI P I NSRGEKCRMHLRDG FLADQ 
LDPIYVAYNW 


5920 


1383 


1499 


QUSAVAHAGVSRI PP* LFPPLHPTFLSLWCLHHKLP/HPPGASM 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQEPAVHIPGQEPLTASM 
LAAAPLHEOKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEL 
LLMLt S PESLHAKI DEAVAVLQAHQAMEQP KAYMH 


5921


727 


157 


VCPGTGGE*GLWGQLGGLPKETPLKPMDAFTGSGLKRKFDDVDV 
G3SVSNSDDEISSSDSADSCDSLNPPTTASFTPTSILKRQKQLR 
RKNVR FDQVTVY YFARRQG FTS VPSQGGS SLGMAQRHNS VRS YX 
LCEFAQEQEVNHREILREHLKEEKLHAKKWKLTKNGTVESVEAD 
GLTLDD V S DEDI DVENVEVDDYF FLQPLPTKRRRALLRASG VHR 
IDAEEKQELRA3 RLS R EECGCDCR LY CD PE ACACS QAG I KCQVD 
RMSFPCGCSRDGCGNMAGRIEFNPIRVRTHYLHTIMKLELESKR 
Q\GAAQ0PQ\ *GALPDCQLQPDRSTGL+ DPSWIGSKGLS FTGKG 
AAATHLI I LRVI ENRGAEGKRK 


5922 


2475 


495 


SYSNWGLFPSVFIQVPRSRTGNLKPIFLFYSYYE\CMETLKG\T 








CLYNATOY KVCSPRNDR PDACYNPSE PAATTVFE 3 RTGL1.LGDT 
SKI I TRT EEKEI PKQ1 TLRFDACAA I MS KKLE I GCGSLH * ERS * 
RVENKYVCH ESGVCKNCAY WPCV I* AT* KKNKN DSVYLQKGEAN 
PSCAAGHCNPLELI1TNPLDPKWKKGERVTLGINRTGLKPQVVI 
LI KGEVH KCS PXPVFQTFYEELNLPAPELLXXTKNLFLQLiAENV 
IFLLNGTE CYVRGGTTIGDRWFWEA*EIiVPTDPAPDIIPI * KAE 
ASNF* VLKTSI IRQYCIAREGKDFI I PVGKPNC J GQKLYNSTTK 
TIT* *DLNHTEKNPFSKFSKLKTA* AHAESH*DWTVPSGLY* IC 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGELLGFSVYASR 

e kkgi v 3 gn wkdnewprer 1 1 cy yg patwaqdg s wgyr/ t p/ vy 
mlnwiirl-oaileiisnetgraltvlawqetomrnaiyonrlal 

DYLLVAEGGVCRKFNLTNCCl>QINDG<5QVV^nVKDMTKliAHVP 
IQVWHAF^PESJjr GKWrPAIGGr KTLI VlsVJbJjV J. rlX CJ-iljLir'Lvlj 
PLLFOM 1 KG I VATLVHQKTS AHVNYMNHYRS J S QRDSKS EDESE 
NSH 


5923 


137 


63C 


QLCGRRGORFRTSI KRMHPI * RTCPNTNL/ 1 ILbSOENTQI RDL 
QQENR ELW 3 S LEEHQDALELI MSKYRKQNTtQIiM VAKKAVDAE PV 
LKAHOSH £ AE I ESQ I DR I CEMGEVMR KAVQVDDDO FCK IQEKLA 
QLELENKELRELLSISSESLQARKENSMDTASQAIX 


5924 


274 


2146 


EKGKVKDAGAEQWISI*SLSCKGSWETQ?SNHLNSLTPPTSVRRM 
PLITTVTLLKMVARHHKKLLCSKAPSTQLQQKI FLHSQKGI HHQ 
S VCM KLK PN TSH 1 1 S I LMGQ PMALVQ LETLAPLT 1 1 IQXFQTQD 
HMKFWKNLPLHSHHLTPSVPQT VI PKKTGSPE I XLX ITKT1 QNG 
RELFESS LCGDLLN EVQASE \Q* NQS I ESRKEKR KXSNKKDS SR 
SEERKSHX I PKLEPEEQNRPNERVDTVSEKPREEFVLKEGS PSS 
ANTI FCSNNGS VHW \ FKFQVGDLVWS K VGTYFWW PCM VS£ DPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYXGHKQyEE 
LLAEATKOAS?CHSEKQKI RKPRPQRERAQWDIGI AHAEKALXMT 
REERIEQYTFIYIDKQPEEALSQAXKSVASKTEVKKTRRFRSVI, 
NTQPEQTNAGEVASSLSSTEIRRilSORRHTSAEEEEPPPVKIAW 
KTAAARKSLPASITMHKGSLDLOKCNMSPWKIEOVFALONATG 
DGKFIDQFVYSTKGIGNKTEISVRGCDRLIISTPNORNEKPTQS 
VSSPEATSGSTGS\rEKKQCRRSIRTRSESEKSTEWPKKKIKKE 
QVGFLHVE£ 


5925 


216 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEEDKMWGQDSTL 
ODTPPPDPEIFRORFRRFCYONTFGPREALSRLKELCHOWLRPE 
INTKEQILEbL.VLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 
DLE LDLSG GQ VPGQVHG PEMLAR GM V P LD P VQES S S FDLHH EAT 
OSHFKKSS R X PRLLQ SRALPAAH I PAPPHEGS PRDQAMAS ALFT 
ADSQAMVK I SDMAVSLI LEEWGCQNLARRNLSRDNRQENYGSAF 
PQGGEKRN E N EESTS KAETSED S AS RG E TTGRS OKEFG EKR DQ E 
GKTGERQQ KK PEEKTRKEKRDSGPAI GKDKKTITG ERG PREXGK. 
iylA»K£>rfc JbJ^oMr 1 X rbli vfivji Ki7XlKuJJ£»t.V3J\.l«r J. rvoooJjJ.rtni\. 
IIHTGEKPyECSECGKAF\SLNS\NLVLHORI\HTGEKPHECNE 
CGKAFSHSSr<?LILHQRIHSGEKPYECNECGKAFSCSSD\LTKHQ 
RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 
\ AFTRSSTLTLHHR 1 HARERASEYS PASLDAFGAFLKSCV 


5926 


2 


233 


DRCLMLKOGS'QPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAJ-JEP 
SDSPHHTPVKPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146


1248 


KHFSKPGS 0 ALYQLKRPASGQNS I S VMPAQKI TKPAAKYGI PLA 
YKKYGDKKLHEKKPLQKHKQAKQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQBKERLEFINRAREOG 
WRIWLSAGG? GEVKAPFLGSGGTIAPS SFSSRGOYEKYHAI FDQ 
MQQQRAEDKEAKWKREIYGRGLPERQXGOLAVERAKOVEEFLOR 
TCPFivMfWKX.T; RFC? WW (57 T.n>JT .naMYftfiRPSSSRGGKPRNKEEEV 
YLARLRQ1 R LQNFNERQQI KAKLRGE KKEANHSEGQEGSEEADM 
RRKK\IESL>^IANARAAVLKEOLERKRKEAYEREXKVWEEHLV 
AXGVXSSDVS PPLGOHETGGSPSKQOWRSVI SVTSALKEVGVDS 
SLTDTRETS E EMQKTNNA1SSKR E 1 LR RLNENLKAUEDEKGKQN 
LSDTFEINVKEDAXEHEKEKSVSSDRKXWEAGGQbVIPLDELTL 
DTS FSTTERH TVGEVI KLGPNGS PRRAWGKS PTDS VLKI LGEAE 
L0L0TELLENT7IRSEISPEGEKYKPL1TGEKKVQCISHSINPS 
AIVDSPVE7KSFEFSEASP0MSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVWISEEKETKETQS7uORITIQEKEVSEDG 
VS S TVDQLS D I H I E PGTNDS QHS KCDVDK S VQ P E P FFH KWHS E 
HLNLVPOVOSVOCSPEESFAFRSHSHLPP>CWKNKNSLLIGLSTG 
LFDANNP KMLRTCSLPDLS KLFRTLMDVPT VG DVRQDNLEIDE I 
EDEKIKEGPSDSEDIVFEETDTDLQELOASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNKLEELRLELEQEMGFEKFFEVYEKIKAIHE 
DEDENI EICS KI VCKI LGNEHQHLYAKI LH LVMADGAYQEDNDE 


5928 


4146


1248 


KHFS XFG SQALY QL K R P AS GQNS I SVMPAQK I TKPAAKYG I PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRK2SEEAAR 
KRR LE F I EXE X XQ KDQ 3 1 SLMXAEQM XROE XE RL»ER I NRAREQG 
WRNVLSAGGSGEVKAPFLGSGGT3APSSFSSRGQYEHYHAIFDQ 
M000RAEDNEAKWKREIYGRGLPERQKG0LAVERAKOVEEFLQR 
KRE AMQNKARAEGKMG I LQNLAAMYGGRP SS SRGGKPRN K.EEEV 
Y LAR LRQ I R LQN FNERQQ I KAKLRGEKKEANKS EGQEGS EEADM 
RR KX \ I ESLKAKANARAAVLK EQLERKR KE A YER E KK VWEEHLV 
AXGVXSSDVS P PLGQHETGGS PSKQQMRS V 1 S VTSALKE VG VDS 
S LTD7RETS E EMQ KTNNAI SS KRE 1 1>RR LN E K L XAQEDE KG KQN 
LSDTFE INVHEDAXEHEKEXSVSSDRKKWEAGGQZ.VI PLDEliTL 
DTS FS TTERHT VG E V IKLGPNGS PRRAWGX S PTDS VL»KI 1X3 EAE 
LOLOTELLENTTIRSEISPEGEKYKPLITGEKKVQCISKEINPS 
AIVDSPVETKSPEFSEASPQMSIiKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCT I TDVWISES KETKETOSADR I TI QENEVS EDG 

T.rccPlfriOT cr\TtiT it nr^TMrNcrvuc v^Tv^rrvv c\ rr\n T? t» cctji/^ rvrn o tr 
Vbb l VLlVLoylHihFoiWUbvnbAvUVUiVbv rniW Vlibc, 

HLNLVPQVQSVOCSPEESFAFRSHSHLPPKNXNXNSLLIGLSTG 

LFDANNPKMLRTCS LPDLS KLFRTLMDVPTVGDVRQDNLE I DE I 

EDEKIKEGPSDSEDIVFEETDTDLQELQASKEOLLREQPGEEYS 

EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 

GEIASECECDSVFNHLEELRLHLEOEMGFEKFFEVYEKIKAIHE 

DEDEK I E I CS K I VON I LGNEHQti t.YAKI LKLVMADGA YQEDNDE 


5929 


3 


1558 


LDFSMTTOLPAYVAI LLFYVSRASCQDTFTAAVYEHAAI LPNAT 

LTPVSREEALALKNRNLDI LEGAITSAADOGAHI I VTPEDA3 YG 

WNFNRDSLYPYLEDI PDPEVNW I PCNNRNR FGCTPVQER1/SCL\ 

AKNNS3YWAN1GDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 

KLVARYHKQNLFMGENQFNVPKEPEI VTFNTTFGS FGI FTCFDI 

TiPKDPAVTTA^nFKVnTTVFPTAWMNrUl.PFI .^AVPFH 4 ? AWAMGM 
ur nurtw l i_i v i\lt rivL/xxvrtr I/^m v Uir n.L->C3r\ vur no/in/u'iuri 

RVNFLASN I HY P S XKMTGSG I Y APNSSRAFH YDMKTEEGKLLLS 
QLDSHPSHSAWNWTS YASS IEALSSGNKE FXGTVFFDEFTFVK 
LTGVAGNYTVC^KDLCCHLSYKMSENIPNEVYAIjGAFDGLHTVE 
GRYYL01 CTLLKCK7TNLNTCGDSAETASTR FEMFSLSGTFGTQ 
YVFPEVLLSENQLAPGEFQVSTDGRLFSLXPTSGPVLTVTLFGR 
LY E KDWASNAS SGLTAQAR 1 1 ML I VI AP I VCSLSW 


5930 


113 


6082 


RGN CFW 1 VP FTMAOKTGLEDPER YLFVDRAV I YWPATQADWTAK 
XLVWIFSERHGFEAASIKEERGDEVMVELAENGKKAMVNXDDIQ 
KMNPPKFSKVEDKAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 
C W I N P YKNIj P I YS ENI I EM YRG KKRHEM PFHIYAISE S A YRCM 

LQDREDQSILCTGESGAGKTEWTKKVIQYLA31VASSHRGRKDHN
IPGE\LERQLLQAKPILESFGNARTVQNDNSSRFGKFIRINFDV
TGYIVGANIETYLLEKSRAVR0AKDERTFHIFYOLLSG\AGEHL
KSDLLLEGFNNYRFLSNGYIPIPGO\QDKGNFRGDPGEAMHIMG
FSHEEILSMLKWSSVLQFGNISFKKERNTDOASMPENTVAQKL
CHLLGMNVMEFTRA1LTPRIKVGRDYVQKAQTKEQADFAVEAIIA
KA7YERLFRWLVHRINKALDRTKRQGASF1GILDIAGFEIFELN
SFEOLCINYTI^EKLQQLFNHTMFILEQEEYOREGIEWNFIDFGL

DLQ PCI DL I ERPANP PGVLALLDE ECWFP KATDKTFVEKLVQEQ 
GSHS KFQKPRQLKDKADFCI IHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQS SDRF VAELWKDVDR I VGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKES LTKLMATLRNTNPN FVRCI I PNHEKRAGK 
LDPHLVLDOLRCNGVLEGIRICRQGFPNRIVFOEFRQRYEILTP 
NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH 

LEEER DLK I TDI 1 1 FFQAVCRG YIARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWVIRVFTKVKPLLQVTRQEEELQAKDEEliLKVK 

EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 

RARLA.AKKQELEEILHDLESRVEEEEERNQILONEKKKMQAHIQ 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 

IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKOEVMISDLEE 

RLKKEEKTRQELEKAKRKLDGETTDLODQIAELQAQIDELKLQL 

AXK EE ELOG ALARGDDETLHKNNAL K W R ELQAQ I AE LQ EDFES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAACQELRTKRE 

CEVAELKKALEEETKNHEAQIQDMRORHATALEELSEQLEQAKR 

FKANL E KNKQGLE TDNKELACE V KVL00V KAES E H KR KKLDAQV 

QELHAKVSEGDRLRVEIiAEKASKLQNELDNVSTLLEEAE KKGIK 

FAKDAASLESQLQDTQELLOEETRQKLNLSSRIRQLEEEKNSLQ . 

EQOEEEEEARKNLEKQVLALOSQLADTKKKVDDDLGTIESLEEA 

KKKLLKD AE ALS QR LEEKALA Y DKLE XTKN RLQQ ELDDLT VDLD 

HQRQVASNLEKKQ\KKFDQLLAEEKS I SARYAEERDRAEAEARE 

KETKALSLARALEEALEAKEEFERQNKOLRADMEDLMSSKDDVG 

KN\n^ELEKSKRALEQQV\EEMRTQL»EEbEDEIiQATEDAKl>RIiEV 

NMQANKAQFERDliQTRDEQNEEKKRLLIKOVRELEAELEDERKQ 

RALAVAS KKKMEIDJjKDLEAOI EAANKARDEVI KOLRKLQAQMK 

DYQRELEEARASRDEIFAOSKESEKKLKSLEAEILQLQEELASS 

ERARRHAEOERDELADEITNSASGKSALLDEKRRLEARIAQLEE 

ELEEEQSNMELLNDRFRKTTL0VDTLNAELAAERSAAQK5DNAR 

QQLERQN KELKAKLQELEGAV KS KF RATI SALE AKI GQLEEQLE 

QEAKERAAANKLVRRTEKKLKEI FMQVE DERRKAPQYKEQME KA 

N ARMKQLKRQLEEAE EEAT RAN AS RRKL QRELDD ATE AN EG LS R 

EVSTLKNRLRRGG PI SFSS SR S GRRQLH LEGASLELSDDDTES K 

TSDVNETQPPQSE 


5931 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATOADWTAK 
KLVWI PS ERHGFEAAS I KEERGDEVMVELAENGKKAMVNKDD I Q 
KMNPPKFS KVEDMAELTCLNE AS VLKNLXDR YYSGLI YTY SGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKV1QYLAHVASSHKGRKDHN 
I PGE\LERQLLQANP I LESFGNARTVQNDtf SSRFGKFI RINFDV 
TGYI VGANIETYLLEKSRAVROAKDERTFH I FYQLLSG\AGEHL 
KSDLLL.EGFWNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSKEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAI LTPR I KVGRDYVQ KAQTKEQADFAVEALA 
KATYERLFRWLVHR I NKALDRTKRQGAS FIG I LD I AG FE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEOEEYQREGIEWNFIDFGIi 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKL.VQEQ 
GSHS KFQKPROLKDKADFCI I HYAGKVDY KADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDOVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNrNPNFVRCIIPNHEKRAGK 
LDPHLVLDOLRCNGVLEG I RICRQGFPNR I VFQEFRQRYEILTP 
NAI PKGFMDGKQACERMI RALELDPNLYR 1GQSKI FFRAGVLAH 
LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 
NCAAYLKLRHWQWWRVFTKVKPLLQVTROEEELQAKDEELLKVK 
EKQTKVEGELEEMERKJIQQLLEEKNILAEQLQAETELFAEAEEM 
RARLAAKKQELEEILHDLESRVEEEEERNO ILQNEKKKMQAH I Q 
DLEBQLDEEEGARQKLOLEKVTAEAKI KKMEEEI LLLEDQNSKF 
I KE KKLMEDR I AECSS QIiAEEEEKAXN LA K I RNKQEVM I S DLE E 
RLKKEE KTRQE bEKAKRKLDGETTDLQDQ 3 AELQAQ I DELKLQL 
AKKEEELQGALARGDDETLHKNNALKVVRELQAOIAELQEDFES 
EKASRNKAEKQKRDLSEELEALKTELEDTliDTTAAQQELRTKRB 
QEVAELXKALEEErKNKEAQIODMRQRHATALEELSEOLEQAKR 
FKANLEKNKQGLETDN KE LACEVKVLQQVKAESEHKR KKLPAQV 
QELHAJO/SEGDRLR VELAEKASKLQNELDNVSTljLEEAEKKG I K 
FAKD AASltES QLQDTQELLQEETRQ KLN LS S R I RQLEEEKNS LQ 
ECX3BEEE E AR KNLEKO VLALOS01ADTKKKVDDDLGT I ESLEEA 
KKKLLKDA^ALSQRLEEKAI^.YDKLEKTKNRLQQELDDLTVDLD 
KCRQVASNLEXKQ\KKFDQL1AEEKSISARYAEERDRAEAEARE 
KETKALSLARALEEALEAKEEFERQNXQLRADMEDLMSSKDDVG 
K*r\^ELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKA0FERDLQTRDE0NEEKKRLL1KOVRE1.EAELEDERKQ 
RALAVAS KKKME I DLKDLEAO 3 E AANKARDE V I XQLR KLQAQMK 
DYQRELEEARASRDEIFAQSXESEKKLXSLEAEILQLQEELASS 
ERARRHAEQERDELADE I TNS AS GKS ALLDEXRRLEARI AQLEE 
ELEEEQSNMELLNDRFRKTTLOVDTI^AELAAERSAAOKSDNAR 
QOLERQNXELKAKLQELEGAVKS XFKATI SALEAXIGQLEEQLE 
C'EAKERAAANKLVRRTEKKLKEIFMOVEDERRHADQYKEQMEKA 
KARMXQLXRQLEEAEEEATRANASRRXLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


572 


RHLEEICFLFLQXGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FG ATLAVGLT I ? VXiS WT 1 I 1CFTCSCCCLYKTCRRPRPV\APP 
PK PP/ PWHAPYPQPPSVPPSY PGPS YQGYHTMPPQPGMPAAP Y 
FMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASQPPYNPAYMDA 
FXAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFS I GKMSTAKRTLSKKEQEELK JCKEDEKAAAEI YEEFLAAFEG 
SDG N KVKT F VRGG WN AAK EEH E TDE KRG K I Y XP SS RFADQKN P 
PNOSSNERPPSLLVIETXXPPLKKGEKEKXKSNLELFXEELKQI 
CEERDERHXTXGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDyAPGSHDVGDPSTT\NFYLGNl\NPQMNLKKCCCCEFGRFGP 
I^VKIMWPRTDEERARERNCGFVAFMNRRDAERALKNbNGKMI 
MS FEMKLGWGKAVP IPPHP1YI PPSMMEHTLPPPPSGLP FNAQP 
RERLKNPNAPMLPPPXNXEDFEXTLSQAIVKVVIPTERNLLALI 
HRMI EFWREGPMFEAM 1 MNRE 1 NNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYIiKGMSEEQ 
ETEAFVEEPSKKGALKEEORDKLEEILRGLTPRKNDIGDAMVFC 
LWAEAAEEIVDCITESLSILKTPLPXKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLC02FSDLNATYRTIOGHLQSENFKORVM 
TCFRAWEDWAIYPEPFLIKLONI FLGLVNI I EEKETEDVPDDLD 
GAr 3 EEELDGAPLEDVDGI PIDATP1 DDLBG VP I KS LDDDLDGV 
PLDATEDSKTCNEPIFKVAPSKWTE1AVDESELEAQAVTTSKWELFD 
V ci FFF FNONOF <5 F n F F XYTCi <3 <> K £ E FHK T .YSN P I K EEMTF 
S KF S K YS EMS E EKRAKLRE IE LKVM KFQDELES GKR PKKPG QS F 
0ECVEHYRDKLLOREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPXSERSERSER 
SHKESSRSRSSHXDSPRDVSXKAKRSPSGSRTPXRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVKSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFS I GXMSTAXRTLSXKEOEELKKXEDEXAAAEI YEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEKETDEKRGKIYXPSSRFADQKNP 
PNQSSNERPPSLLVIETXKPPLKXGEKEXKKSNLELFKEELXQI 
0EERDERHKTKGRLSRFEPPOSDSDG0RRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLXXCCCQEFGRFGP 
LAS V K I M WPRTDEE RARERN CG F VA FMNRRDAE RALKNLNGKM I 
MS FEMKLGWGKAVPIPPHP1YI PPSMMEHTLPPPPSGLPFNAQP 
REItLKNPNAPMLPPPX>JKEDFEKTLSQAIVKWIPTERNLLALI 
HRMI EFWREGPMFEAMIMNR E I NNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFXN3SFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNniGDAMVFC 
LHNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVAVAS YYRKFFETKLCQI FS DLNATYRTI QGHLQS ENFKQRVM 
TCFRAWEDWAIYPEPFLIXLQNI FLGLVNI I EEXETEDVPDDLD 
GAP I EEELDGAPLEDVDGI PI DATPIDDLDGVPIKSLDDDLDGV 






WO 01/53312 



PCT/US00/34263 



PLDtTEDSKKNEPI FKVAPSKWEAVDESELEAQAVT7S KWELFD 
OHI r S EEE ENQNQEE ESEDEEDTQSS KSEEHH LYSN P I KEEMTE 
SKF f. KYSEMSEEKRAKLRE I ELKVMKFODELESGKRPKKPGCS F 
QEC V EHYRDKLLQRE K EKELER ER3RDKXDKE KLES RS KDKKEK 
DECTPTR KERKRRKSTS PSPSRS SSGRRVKS PSPKSERSERSER 
SHieSSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCFFRSVF 


5935


3 


4493 


S Y K 1 S G W R LS R P PRC F WAG WR G I GR FGTMAP VHGDD CE I G ASAL 
SDSG S FVS SRARREKKS KKGROEALER LKKAKAGER YK YEVEDF 
TGV V E EVDEEQ YSKLVQARQDDDW I VDDDG IG YVEDGREI FDDD 
LED D ALD ADE KG KBG KARN KDXRNVXKLAV7KPNNI K S M F I AC A 
GKKTADKAVDLSKDGLLGDILODLNTETPQITPPPVMILKKKRS 
I GAS PNP FS VH TATA V PSG K I AS PVSRKE PPLTPVPLKRAEFAG 
DDVOVESTEEEOESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 
KESrPAEEVKQSADSGKGTVSYLGSFbPDVSCWDIDOEGDSSFS 
VQE VO VD S SH L PLVKG ADE EQV FH FY WLD AY EDQYNQ PG WFL F 
GKViC I ESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGT 
F I S K KD VYE E FDEK I ATKY KI M K FKS KPV E KNYAFE I PDV P E KS 
EYLEVKYSAEMPQLPODliKGETFSHVFGTNTSSLELFLMNRKIK 
GPCWLEVKKS TALNOP VS WCKVEAMAL KPDLVNV IKE VSPPPLV 
VMAl'SMKTMQNAKNHONEIIAKAALVHHSFALDKAAFKPPFQSH 
F CVVS KPKDCI FPYAFKE VI EKKNVKVEVAATERTLLGFFLAKV 
HK1 DPDI IVGHNIYGFEbEVLLORINVCKAPHWSKICRLKRSNM 
FKLGGRSGFGERNATCGRMICDVEISAKELIRCKSYHLSELVQQ 
ILKTERWIPMENIONMYSESSOLLYLLEHTWKDA\KFILQIMC 
ELWv liPLALQI TNI AGNI MSRTLMGGRSERNEFLLLHAFYENN Y 
IVFDKOIFRKPQQKLGDEDEEIDGDTNKYKKGRKKGAYAGGLVL, 
DPKVGFYDKFI LLLDFNS LYPS 1 1 QE FNI C FTTVQR V AS EAQKV 
TEDGEOSQIPELPDPSLEMGILPREIRKLVERRKOVKQLMKQOD 
LNPrjLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 
YKGREI LMHTKEMVQKMNLE VI YGDTDS I M I NTNSTNLEEVFKL 
GN K V K SE VNKL YKLLE 1 DI DGVFKSLLXjLKKKKYAALWEPTSD 
GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIG01LSD0SRDTIV 
ENjOICRLIEIGENVLNGSVPVSQFEINKAJLTKDPQDYPDKKSLP 
HVKYALW INSOGGRKVKAGDTVS YVI CQDGSNLTASORAYAPEQ 
LQKCDNUTIDTQYYIAQQIHPWARI CEPI DG3 DAVLI ATGWEL 
\DF1 C FKVHHYHKDEENDALLrGGPAQLTDEEKYRDCERFKCPCP 
TGG7 ENI YDNVFDGSGTDMEPS LYRCSNI DCKAS PLTFTVQLSN 
KLI N.D I RRFI KKYYDGWLICESPTCRNRTRHLPLQFSRTGPbCP 
ACMKA TLQPEYSDKSLYTOLCFYRY I FDAECALEKLTTDHEKDK 
LKKC FFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 
KS 


5936 


1124 


139 


RGEiOFDAEFRRFACLGFGERLOEFSRJbLRAVHRSRAWTCYLAI 
RML.!<ATCCPSPTTTAC1GPWQRAPPLRLLVQKREADSSGLAFAS 
KSLC r. IRK KGLLLRPVAPLRTRP PLLI S LPQDFRQVS S VI DVDLL 
PETH RR VRLHKHGSDR PLG FY I R DGMS VR VAPQG \ LE R VPG I FI 
S RLV R GG LAES TGLLA VSDE I LE VNG I EVAGKTLNQ VTDMMVAN 
SHNXLIVTVKPANQRNNVVRGASGRLTGPPSAGPGPAEPDSDDD 
SSDI V I ENRQPPSSNGLSOGPPCWDLHPGCRHPGTRSSLPSLDD 
G;EQ/ l .SSGWGSRIRGDGSGFSL 


5937 


31 


ieoc 


PTSLLKSTVOI^iCRLLQDKRYQCVYSLAEIFKVLASFYVILVIL 
YGL7S S YS LWWMLR S S LKQYS FEALREKSNYSDI PDVKNDFAFI 
LHLADQYDPLYSKRFS I FLSE VS 2NKLKQI NLNNEWT VEKLKS K 
LVKN AQDKI ELHLFMLNGLPDNVFELTEMEVLSLELI PE VKLPS 
AVS C IrVNL KE LRVYHS S LWDH PALAFLEENL KI LRL KFTEMG K 
IPRVrvFHLKNLKEbYLSGCVLPEQLSTMOLEGFODLKNbRTLYL 
KSS LS RI PQWTDLLPS LQKLS LDNEGS KLWLNNLKKM VNLKS 
LEL } ? CDLER I PHS I FS LNNLHELDLRENNLKT VEE IIS FQHLQ 
NLS CLKLWHNNI AY I PAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKL!-:YLDLSYNHLTFI PEEIQYlASNLQYFAVTNNNI EMLPDG 
L FQCKKLQCLLLGKN S LMNLS PH VG ELSNLTH REP I G \N YLE Tl> 
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938 


395 


1861 


YKGEGFFCNQEARGEPJSKKKKAMSSPNIWSTGSSVYSTPVFSQK 
MTVWILLLLSLYPGFTSQKSDDDYEDYASNKTVn/LTPKVPEGDV 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTIKVI,RLNSNMVGKIWIPDTFFRN 
SKKADAHWI TTPNRMLR I WNDGRVLYSLRLTI DAECQI,QLHNFP 
MDEHS CPLEFS S YG Y PR EEI V YQWKRSS VEVGDTRSWRL YQFSF 
VG LRNTTE WKTTSG D YWKS VY FDLSRRMG Y FT I QTY I P CTLJ 
WLS W VS FWI N KDAVPARTSLG3 TTVLTMTTLSTI ARKS LPKVS 
YVTAMDLFVSVCFIFVFSALVEYG\TLHYFVSNRKPSIO)KDKKK 
KN P AP TI D I RP R SAT I QMNN ATH LQERDE EYG Y E CLDGKDCAS F 
FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYW 
VSYLYL* 


5939


66 


1401


I R PG YLKEVQEN5 PG H RAG LEPFFDFIVSI NGSRLNKDN DTLKD 

LLKANVEKPVKMLI YS S KTLELRETS VTPSNLWGGQGLLGVS IR 

FCSFDGAKENVWH VLE VESNSPAALAGLRPHSDYI IGADTVMNE 

SEDLFSLIETHEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS 

LGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 

VQLSSVNPPSLSPPGTTGIEOSLTGLSISSTP\PAVSSVliSTGV 

PTVP\LLPPQVNOSLTSVPPMESSYliHLPGLMPFTRQGLPNLPQ j 

PSTFNLPR\PTKSWPGVGbYQEFVKPGVXiPPLSSMPPRNLPG\I 

APIiPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 

TTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 

VD ANAS ESP 


5940 


145 


71' 


RRSAS RS AS PRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLPVHMGLVITEVEQEPSFSDIASLWWCMAVGISYISVYDH 
QG I FKRNNSRLMDE I LKOOOEbLGLDCSKYSPEFANSNDKDDOV 
LNCHLAVKVLS PEDGKADI VRAAQDFCOLVAOKOKRPTDLDVDT 
IA\ VYLVQMWL I L I 


5941 


13 


6147 


MCLGRMGAS S PRS P EPVGP PAPGLPFCCGGSLLAVWLLALPVA 
WGQCNAPE W\ LP FAR PTNLTDEFEFP IGTYLNYECRPG YSGR P F 
S 1 1 CLKNS VWTGAKDRCRRKS CRNPPDPVNGKVHVI KG I QFGSQ 
IKYSCTXGYRLIGSSSATCIISGDTVIWDNETP1CDRIPCGLPP 
TITNGDFI STNRENFHYGSWTYRCNPGSGGRKVFEbVGEPSI Y 
CTSNDDQVGIWSGFAPCClIPNKCTPPNVENGILVSDNRSIiFSL 
NEWEFRCQPGFVMKGPRRVKCOALNKWEPELPSCSRVCOPPPD 
VLHAERTOREKDNFSPGOEVFYSCEPGYDLRGAASMRCTPOGDW 
S PAAPTCEVKS CDD FMGOLLNGRVLFPVNLQLGAKVDFVCDEGF 
QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 
KPLE VFP FGKAVNY TCDPHPDRGTS FDb I GEST I RCTSDPQGNG 
VWSS PA PR CG I LG HCCAPDHFL FAKLKTQTNASDFP I GTS LK YE 
CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH 
VITDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPX 
CQRIPCGLPPTIAl^GDFISTNRENFHYGSWTYRCNPGSGGRKV 
FELVGEPS I YCTSNDDOVG1WSGPAPQCI IPNKCTPPNVENGIL 
VSDNRSLFSLNEWEFRCOPGFVMKGPRRVKCQALNKWEPELPS 
CSR VCQP P PD VLHAE RTQR DKDNFS PGQE VFYS CE PG YDLRGAA 
SMRCTPQGDWS PAAP TCEVKS CDDFMGQLLNGRVLFPVNLQLGA 
KVDFVCDEG FQLKGS S AS YCVLAGMESLWNSS VPVCEQ I FCPS P 
PVIPNGRHTGKFLEVFPFGKAVNYTCDPHPDRGTSFDHGESTI 
RCTSDPQGNGVWSSPAPRCGILGHCOAPDHFLFAKLKTQTNASD 
FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 

PPDPVNGMVHVITD:QVCSRINYSCTTGHRLIGHSSAECILSGN

TAHWSTKP PI COR 1 P CGLPPT I ANGDFI STNR3NFKYGS VVTYR 
CNLGSRGRKVFELVGEPSI YCTSNDDQVG IWSGPAPQCI I PNKC 
TPPNVENG I LVSDNR SbFS LNE WE FRCQ PGFVMKG PRRVKCQ A 
LNKWEPBLPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 
E PGYDLRGAAS LHCTPOGDWS PEAPRCAVKS CDDFLGQLPHGRV 
L FPLNLQLGAKVS F Y CDEG FRL KGS S VSHCVLVGMR S LWNNS VP 
VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 
TFNLIGESTI RCTSDPHGNGWSSPAPRCELSVRAGHCKTPEQF 
FFASPTIPINDFEFPVGTSLNYECRPGYFGKMFSISCLENLVWS 
SVEDNCRKKSCGPFPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCbVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSKN 
RTSFHNGTWTYQCKTGPDGEQLFELVGERS2YCTSKDDQVGVW 
S S P P PR C I STN KCT APEVEN AI RVPGNRSF F SLTE I 1 R FRCQPG 
FVrWGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CDDFLGOLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 

VS YTCD PHPDRGH TFNL IGESTIRRTS EPHGNG VWS S P A PR CEL 
PVGAACPHPPKI ONGHY I GGHVSLYLPGMT I SYTCDPGYLLVGK 
GFIFCTDQGIWSQLDHYCKEVNCSFPLFMNGISKELEMKKVYHY 
GD YVTLKCEDG YTLEGS P WSOCOADDRWDP PLAKCTSRTHDALI 
VGTLSGTI FFI LLII FLSWI I LKKRKGNNAKENPKEVA1 HLKSQ 
GGSSVHPRTLOTNEENSRVLP 


5942 


4509 


688 


yiiYVRNRANPLAYGISHKAYQIDPPL\RKHREO\IiVIE\VGRKL 
DK\ AQMI RFEERTG YFS STDLGRTASHYYI KYNTI ETFNELFDA 
HKTEGDI FA I VS KAEEFDQJ KVREEE I EEIiDTLLSNFCELS TPG 
GVENSYGKINILUOTYINRGEMDSFSLISDSAYVAONAARIVRA 
LFEIAiiRKRWPTMTYRLLNLSKAIDKRLWGWASPLROFSlLPPH 
MLTRLEEKKLTTOKbKDMRKDEIGHl LHHVN I GLKV KQCVHQ I P 
S VMMEAF I OF i TRTVLR VTLS I YADFTWNDQVHGTVGEPWW I W 
E DPTNDH I YHSE Y FLALKKQVI SKEAQbLVFTI P I FEPLPSQY Y 
I RAVS DR W LG AEAV CI INFQHLI LPERH P PHTELLDLQ P LP I TA 
LGCKAYEALYNFS HFNPVQTQI FHTLYHTDCtfVLLGAPTGSGXT 
VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVR3 EEXL 
GKKVIELTGDVTPDMKS3AKADLIVTTPEKWCGVSRSW0NRNYV 
QOVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 
LSTAliANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGO 
HYCPRMASMNKPAFOAIRSHSPAKPVLIFVSSRRQTRLTALELI 
AFLATEEDPKOWLNI^DEREMENIIATVRDSNLKLTLAFGIGNHH 
AGLHERDRKTVEELFVNCKVQVLI ATSTLAWGWFPAHLVl I KG 
TEYYDGKTRRYVDFPITDVLQMMGRAGRPOFDDOGKAVILVHDI 
KKDFYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 
YITWTYFFRRLlMNPSYYNLGDVSHDSVNKFbSHLIEKSblELE 
L£ Y CI E 1 GEDNRS I EPLTYGR I AS YY YIjKHQTVKMFKDRLKPEC 
STEELLS I LSDAEE YTDLPVRHWEDHMNSEIAKCLP I ESNPHSF 
DSPHTKAHLLLCAHLSRAMLPCPDYDTDTKTVLDQALRVCOAML 

M/ftftHnPMT imil M T T" VI T T /-\%I(\T T / m >/* , VHJT tmPrT T T*T DV1TCMUIH 

UV/\/iWU««ljV J VJjiM X J NijlQMVlyL»KW]jjUJi>oijijI L>Vr*± CJVrtni* 

HLFKKWKP I MKGPHARGRTS I ECLPELI HACGGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPE INVG I SVKGSWDDLVEGHNELS VST 
LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 
PRFPKSKDEGWFLI LGEVDKRELIALKRVGYI RNHHVASLSFYT 
PEI PGRYI YTLYFMSDCYLGIiDQQYD/NLSORYTSESFCTGOHO 
GL 


5943 


1 


2274 


DKPTRHKTYLSSSKAKMAAAEGPVGDGELWQTWLPNHWFLRLR 
EGLKNQSPTEAEKFASSSLPSSPPPQLLTRNWFGLGGELFLWD 
GEDSSFLVVRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LS PTQHHVAL2 GI KGLMVLELPKRWGKNS EFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHWLLTSDKVIRIYSLR 
E PQTPTNV 1 1 LS E A E EES LVLNKGRAYTASLGE TAVA FDFG PLA 
AVPKTLFGONGKDEWAYPLYI LYENGETFLTY I SLLHSPGN/1 
WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPNILVIATESGML 
YHCWLEGEEEDDHTSEKSV7DSRIDLIPSLYVFECVELELALKL 
ASGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKLHKFLGSDEEDKDSLQELSTEOKCFVEHILCTKPLPCROPAP 
I RGFW I VPDILG PTM I CI TSTYECL I WPLLSTVHPAS PPLLCXR 
EDVEVAES PLR VLAETPDS FEKHI RS I LQRS VANPAFLKAS EKD 
IAPPPEECLOLLSRATOVFREQYILKODLAKEEIQRRVKLLCDO 
KKK0LEDLSYCREERKSLREMAERLADKYEEAKEKQED1MNRNLK 
KLLHSFHSELPVLSDSERDMXKELQLI PDQLRHLGMAI KQVTMK 
KDY0O0K:4EKVLSLPKPTIIbSAY0RKCIOSILKEEGEHIREMV 
KQ3NDIRNHVNF 


5944 


167 


3428


FS I ATFTDE FEVLTEFPS ATTTTT I G I SATKTTLAGSKGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSOPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
JCKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPS PLS SPNGKL7VASPKRGQKREEGWKEWRRS KKVSVPSTVI 
SRV3GRGGCNINAIREFTGAHIDIDKQKDKTGDR1ITIRGGTES 
TRCATQLINALIKDPDKE2DELIPKNRLKSSSANSKIGSSAPTT 
TAANTS LMG I KM TTVALS S TS QTATALTV P AI S S AS THKT I KNP 
VN\NVR PG FPVSFP\ LAYPPPQFAHALLAAQTFOOIRPPRLPMT 
HFGGTFPPAOSTWGPFPVRPLSPARATNSPKPHMVPRHSNQNSS 
GSQVNSAGSLTSSPTTTTSSSASTVPGTSTNGSPSSPSVRRQLF 
VTVV/KTSNATTTTVTTTASNTONTAPTNATYPMPTAKEHYPVSS? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEOEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSOEPRPPLQQSQVPPP 
EVRKTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PA3RPPPKGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACONSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFVJGGSWSSOSTPESMliSGKSSYLPNSDPLHOSDTSKAPG 
FRPPLQRPAPSPSGI VNMDSPYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNODKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQKVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPI PDGAGGPI FNGPHAADPSWNSL1K 
MVSSSTEN>3GPQTVWTGPWAPHMNSVHMN0LG 


5945 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGOADFFF1.LVSEAVVATGSPRA 
W LTCLI LPLi PG 1 1 FS VLPKAMS R PLLI T FTPATD P S DLWKDGQQ 
QPQPEKPESTLDGAAARAFYEAMGDESSAPDSQKSQTEPARER 
KRKKRRI MXAPAAE AVAEGASGRHGQGRS LEAEDKMTHK I bKAA 
QZGDLPELR R LLEPHEAGGAGGNI NARDAF W WTPLMCAARAGQG 
AAVS YLLGR G AAWVGVCELSG R DAAQLAEE AG FP E VARMVRESH 
GETRSFENRSPTPSLQYCENCDTHFQDSNKRTSTAHLLSLSQGP 
OPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTV 
LKRDOEGLGYRSAPQ PRVTHFPAWDTRAVAGRE\ T PPRVATLSW 
REERRREE \ KDRAWERDLRTYMNLEF 


5946


541 


1666 


1LGS YSS 1 OPEE YS \S WC\EWLODLLA\ YVS PK\HS YLRDLP 
SEGS PQRVNS I DFV\ EL \ EHLQPDVLVHAV LRWDF/ TI LTEAV 

YIWEFKYLFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPS FVKI SDLATHLEDKCSGWL I KAQ I S ELAFP I TASQ 
KIALNAHSSLKSIFSSLPNIVYTGCAKCGLE3.ETDENRIYKQCF 
SCLPFTMKKI YYRPALMTAIDGRHDVCIRVESKLiIEKILLNlSA 
DCLNR VI VP S S E I TYGMWADL FHS LLAVSAE PC VLKI QSLFVL 
DENSYPLQODFSLLDFYPDIVKHGANARL


5947


3 


1317 


RGIPDRRRRGPIGRVNMDLENKVKKMGLGKEQGFGAPCLKCKEK 
CEGFELHFWRKICRNC\NVAKKSM/TVL1/SNEEDRKVGK1»F3DT 
KYTTL I AKLKSDGI PMYKRNVMILTNPVAAKKNVS 3 NTVTYEWA 
c p vnw r . zx p n ymom i . p k f ko pvag<? FG AO YR KKC LAKOL PAHD 
QDPS K CHELS PRE VKEMEQFVXKYKS EALGVGDVKLPCEMDAQG 
PKQMN I PGGDRSTPAAVGAMEDKSAEHKRTQYS CYCCKLSMKEG 
DPAI Y AERAG YDKLWHPACFVCSTCHELLVDMI Y FWKNBKLYCG 
RHYCDSEKPRCAGCDEL1FSNEYT0AENQNWRLKHFCCFDCDSI 
LAG E I YVMVNDKP VC KP CYVKNHAVVCQG CHNAI D P EVQRVTYN 
NPSWHASTECFLCSCCSKCLIGOKFMPVEGMVFCSVECKKRMS 


5948 


39 


3370 


YRERYPVSGGSVliRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GKHYOMRRKGRCHRGSAARH PS S PCS VKHSPTRETXTYAQAQRM 
VEIEI EGRLKR I S I FDPLEI I LEDDLTAQEMSE CN SNKENSERP 
PVCLRTKR1^KNNRVXKKNEA1»PSAHGTPASASALPEPKVRIVEY 
SPPSAFRRPPVYYKFIEXSAEELDNEVEYDMDEEDYAWLEIVNE 
FIRKGDCVPAVSQSMFEFLMDRFEKESHCEKQKOGEQQSLIDEDA 
VCC3 C^DGECQNSNVILFCDMCNLAVHOECYGVPYI PEGOWI.C/ 
RAHCLCS RARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ I P 
EWGFANTVFIEPlDGVRNIPPARWKLTNCNijCKEKGR/VGACl 
OCHKANCYTAFHVTCAOKAGLYMKJJ!EPVKELTGGGTTFSVRKTA 
YC D V HT P PG CTRR P LN I YGDVE M KNG VCR KES S VKT VRS TS KVR 
KKAK KAKKALAE PCAVLPTVCAP Y I P PQR LNR I ANQVAI QRKKQ 
FVEKA.H £ Y WLLKRLS RNGAPLLRRlrQSSLQSQRS SQQRENDEEM 
KAAK EKLKY WQRLRHDLERARLLI ELLRKREKLKREQVKVEQVA 
NELRbTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHl 
KH PKDFATMRKRLEAQG YKNLHE FEEDFDLI I DNCKKYNAR DT V 
FYRAA VR LRDQGG WLRQAR REVDSIGLE E ASGMHLPER PAAA P 
RRPFSWEDVDRLLDFANRAHLGLEEOLRELLDMLDLTCAflKSSG 
SRSKF^KXLKKEIALLRNKLSQQHSQPLFTGPGLEGFEEDGAAL 
GPEAGLEVnuPRLETljLOPRKRSRSTCGDSEVEEESPGKR 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNA P K CG RGKPALVRRHTLEDRS EL I S CI ENGKYAKAAR 3AAEV 
CQSSW1 STDAAAS VLEPLKWWAKCSGYPSYPALI IDPKKPRV 
PGKKK'GVTI PAPPLDVLKIG EHMOTKS DEKLFLVLFFDN KRSWQ 
WL PKS KM VPLG I DETI DKLKMMEGRN S SIR KAVR I AFDRAMNHL 
SRVKGEPTSDLSDID 


5949


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNH YQKR RKGRCHRGS AARHPSS FCS VXHS PTRETLTYAQAQRM 
VEI H : EGRLHRI S I FDPLE 1 1 LEDDLTAQEMS ECNSNKENS ERP 
PVCLRTKRHKNNRVKKKKEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KR KG PC V P A VSQS M FE FLMDR F E KE SH CENQKQG EQQSL I DEDA 
VCC1 CMDGECONSNVI LFC DMCN LA VHQE CYG VP Y I PEGOWLC/ 
RAHCLCS RARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E\VGFA!vITVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTP PGCTRRPLN I YGDVEM KNG VCR K ESS VKTVRSTS KVR 
KKAKKAK>CALAEPCAVLPTVCAPYIPPORLNRIAN0VAI0RKKO 
FVERAHSYKLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLK YWQRLRHDL ERARLL I ELLRKR EKLKREOVKVEOVA 
MEL.RL1 PL7VLLRSVLDQLQDKDPARI FAOPVSLKEVPDYLDHI 
KH PMDF ATMRKRLEAQG YKNLHE FEEDFDL 1 1 DNCMKYNAKDTV 
FYRAAVKLRDQGGVVLRQARREVDSIGLEEASGKHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEOLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQOHSQPLPTGPGLEGFEEDGAAL 

TNGFGGAJ?SEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FN AP KCGRGKPALVRRHTLEDRSELI SCI ENGNY AKAARI AAEV 
GQSSMWI S7DAAASVLEPLKWWAKCSGYPSYPALI IDPKMPRV 
PGHHKGVT J PAPPLD VLKIGEKMQT KSDEKLFLVLFFDNKRSWQ 
WLP K S KM V P LG IDETI D KLKMMEG RNS S I R KAVRI AFDRAMNHL 


5950 


1166 


373


ESRS^TMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR 
CLCROH R P VQLCAPHRT CRE ALDV LA KTVAFLRNLP S FWQLP PO 
DQRR LLQG C WG PLFLLGLAQDAV T FE VAEAP V PS I LKKILLEEP 
SSSGGSGCLPDRPQPSLAAVQWLOCCLESFWSLELSPKE\YACL 
KGP I LFN F DVPGLQAASHIGHLQOEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKS I PTS LLGDLFFRP I IGDVPIAGLLGDMLLLR 


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDS I SGKTGETG VEEMIATRK 

veods k stvklsheddh i ledagssdi ssdaactnpnktenslv 
glps cvd e vtecnlelkdtmgi adktentlernki epix5yceda 
esnrole ste fnksnle wdtst fgpesni lenaicdvpdonsk 
olnai estk: eshetanlqddrnsqsssvsylesksvkskhtkp 
viks ko) jkttdapkkivaakyev1ks ktkvnvksvkrntdvpes 
qqnfhr pvkvrkkqidke pkiqscnsgvksvknqahsvlkktlq 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








DOTLVQIFKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK 
0CHKPO0QAPAMKTNSHVKBELEHPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSSKSFSLDEPPLFIFDNIATIRREGSDHSSSFESKYM 
WTPSKCCGFCKKPKGNRFMVGCGRCDDWFHGDCVGLSLSOAOQM 
GEEDKEYVCVKCCAEEDKKTE I LDPDTLENQATVEFHSGDKTME 
CEKLGLSKHTTNDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 
WEI KKWQLAPLRKMGQPVLPRRS SEEKS EKI PKEST7VTCTGEK 
AS KPGTKEKQEMKKKKV\ EKG V3JTVHPAASASKPS ADQ I RQSVR 
HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFSFFRDTDAK 

ykiakyrslmfnlkdpknnilfkkvlkgevtpdhlirmspeeias 
kelaawrrrenrht2emiekeqreverrpitkitkkgeieiesd 
apmkeqeaameiqepaankslexpegsek\rz<eevdsmskdtts 
ok rqhlfdlnck1 c i grmaf p vddls f kkvkvwgvar khsdne 
aesiadalsstsnilaseffeeekqespkstpspaprpempgtv 
evestflarlnfiwkgfinmpsvakfvtkaypvsgspeyltedl 
pds 3 qvggri spqtvwdy vek3 kasgtkei cwrftpvteedqi 
sytllfayfssrkrygvaannnkqvkdmyli plgatdki phplv 
p fdg pglelh r pnlllgl 1 1 rqkjb krohs acas ts k i aetpes a 
pfialppdkksk1evsteeapeeendffnsfttvlhkqrnkpq0 
nloedlptaveplmevtkqepfkpiirflpgvligwenqpttlel 
ankplpvddilqsllgttgovydoXaosvmeqntvkeipflneq 
tnski ektdnve vtdgenke1 kv k.vdni sestdksae i etswg 
sssisagsltslslrgkppdvsteafltnlsiqskqeetveske 

KTLKRQLQEDQENNIjQDNQTSN S S PCRSNVGKGNIDGNVSCSEN 

lvaktarspqfinlkrdproaagrsqpvttseskdgdscrngek 
umlpglshnkehlteqinveeklcsaeknscvqqsdnlkvaqns 
fsveniqtsqaeqakplqedilmonietvhpfrrgsavatshfe 
vgntcpsefpsksitftsrstsprtstnfspmrpogpnlqhlks 
sppgfpfpgppnfppqsmfgfpphl?ppllpppgfg\fa\qnpm 

VPWFPW\HIiP\GQPQRMMGPLSOASRY1GP0NFYQVKDIRRPE 

rrhsdpwgrqdqqqldrpfnrgkgdrqrfysdshhlkrerheke 
weoeserhrrrdrsqdkdrdrksreeghkdkerarlskgdrgtd 
gkas rdsrnvdkkpdkpks ed y ekdk ere ks khregekdrdr yh 
kdrdhtdrtkskr 


5952


3226 


639


P PAR RS ARDLPRALSME AAR PSGS WNG ALCR LL \ L VTL \ AF1>I F 
AS DACKNVTLHVPS KLDAEKLVGRVN LKECFTAANLIHSSDPDF 
01 LEDGSVYTTNTILLSSEKRSFTI LLSNTENOEKKKIFVFLEH 
OTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
OSDTAQNYTIYYSIRGPGVDOEPRNLFYVERDTGNLYCTRPVDR 
EOYESFE I IAFATTPDG YTPELFLPL I I KI EDENDN YP I FTEET 
YTFT I FENCR VGTTVGQ VCATDKDEP DTMHTRLKYS 1 1 GQ VP PS 
PTLFSMHPTTGVI TTTSSQLDREL3 DKYQLK 1 KVQDMDGQ YFGL 
OTTS TCI INI DDVNDHLPTFTRTS YVTS VEENTVDVEILRVTVE 
DKDIjVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCVVKPL 
N YEE KQQMI hQI GWNEAPFSREAS FRSAMSTATVTVNVEDQDE 
GPECUPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTI DENTGS I KVFRS LDREAETI KNGIYNI TVLASDQG 

Potpittt r»T TT /"\ Pit r\Tri mcdpt nvvTU T T nTMP c ti p Ttri^rnn 
(rKJ LIOI Jj(jIli»v?DVNDNi>Pr IPKJv.1 VIX^M J lMb£»AEIVAVDP 

DEFIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNTDPPF 

GS Y Wp I TVRDRLGMS S VTS LD VTLGDCI TENDCTHR VDPRI GG 

G G VC LGKWAILAI LLG I ALFFC I LFTL VCGASGTS KQ P KV I PDD 

LifWJ\Jis LiJ. Von lZnt'\yUUj\\/ j Z3t\y*\jr J i\Jl VteHjfViJto VUVjX vtj£>u 

I KNGGQETI EMVKGGHQTSESCRGAGHHHTLBSCRGGHTEVDNC 
R YTYSE WHS FTQPRLGEES I RGHTL I KN 


5953 


330 


811 


PI JiCNPriPnWYWWVKOF^FT QKF^OEMDARPKLDIjfiFKEGOTTif 
LCI GN I TNKKGGASKPRTARGG GLS LL P P PPGG KVT I P PPSS / V 
KLP S TNHVTP PS I P KSNHGGS DADI LLDLDS PAP VTTP APTP VS 
VSNDLWGDFSTASSSVPNQAPOPSNWVOF 


5954


32 


2130 


PPPPP PKLANMADLEAVIiADVS YLMAM EKS KATPAARASKRI VL 
PEPS 3 RS VMQKYLAERNEITFDKI FNOKIGFLLFKDFCLNEINE 
AVPQVKF YEEI KE YEKLDNEEDRLCRSRQI YDAY3 MKELLS CSH 
PF S KQAVEH VQSHLS KKQVTSTLFQPY I EE I CESLRGD I FQXFK 
E S DKFTR FC0W XN VE l»M I HLTKNE FS VHR 1 1 GRGG FG E VYGCR K 
M5TGKMVAKXCLHKKRI XM KCC- ET1ALNER I MLSLVSTGDCPF 1 
VCMT YAFKT PDXLCF 1 LDLMNGGDLH YHLS QHG V F£ E KEMR FYA 
TEI I LGLEHMHNRFWyRDLKFANILLDEHGHARIS \DLGIACD 
F £ KKKPHAS VGTHGYfAPEVLOKGTAYDSS ADWFSLGCMLFKLL 
RGHSPFROKKTXDKHElDRKTLTVNVEIiPDTFSPELKSLLEGLL 
CRDVSKRLGCHGGGSOEVXEHSFFXGVDWOHVYLQXYPPPLIPP 
RGEVNAADAFDI GSFDEEDTKG1 KLLDCDQEL Y KNFPLVI SERW 
COE VTETVY E AVNADTDK I EAR XRAXNXQLGHEEDYALG KDCI H 
HGYM LKLGK F FLTQWORR YFYL FPNRLEKRG EG E SRONLLTMEO 
II.SVEETQIKDXKCIL-FRIKGGXOFVLQCESDPEFVOMKKELNE 
TFKEAQRJjLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL 


5955


1726 


444 


krerefrla.vcpl.rypsayesspgtebrecglcrsgoefadcrk i 
panrqdvlsgwinlpvloltkdplktpgrldhgtrtafihhreo 1 
wkrcin3 krdvglfgvlnei anseeevfewvktasgwalalcr ! 
wasslhgslfphlslrsedliaefaqvtnwsscclrvfawhpht 

NX FAVALLDDS VRVYNAS STI VPSLXHRLQRNVASLAW XPliS AS j 
VIAVACQSC1LIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA | 
W^PSGGRLLSASPVDAAIRVWDVSTETCVFLPWFRGGGVTNLLW j 
SFDGSKI1ATTPSAVFRVWEA0MWTCERWFTLSGRC0TGCWS PD ' 
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALEVQSQQRLWQICLj 
ROQYRHQMVRRGLGERLTPWSGTPVGNVWLOi ' 


5956


1705 


13S 


GVGVRGARAr<ATVQEKAAALNLSALHSPAHRPPGFSVAQKPFGA 
TYVWSSI INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDV3 FSHL- 
3 QNKYFGDVD1 PRAKWRVCQALMDYXVFEAVPTXVFGKDXXPT 
FEDS SCSLYR FTTI PNCDS 0U5KEN XLYS PARYADALFKS SDI R 
SA S LEDLWENLS 1»K PANS P H VNI S ATLS PQ VI NE VWQEET 1 GRL 
LOLVDLPLLDS LLKQQEAVPX I PQPXRQSTMVNS SNYLDRGI LK 
AYSDSQEDEWLSAAIDCSEYLPDOMWEISRSFPEOPDRTDLVK 
ELL'FDAIGRYYSSREPLLNHLSDVHNGIAELLVTJGKTEIALEAT 
QLLLKLLDFONR EE FRRLLY FMAVAANPSE FKLOKES DNRMWK 
R3F£KAIVDNKNLS:<GKTDLLVLFJ*\MDHQKDVFXIPGTL\HK1 

VS \VK\LMA2 qngrdpnrdagyi ycqridqrdysnntekttxde 
llnllxtldedsklsakexkk\llgqfyxckpdifiehfgd 


5957


1479 


453 


elpvavamdtldrwxpxtxrakrflekrepklneni kkamli k ! 
ggn'anatvtkvlkdvyalxkpygvlyxkxnitrpfedotsleff 

SKKSDCSLF.^FGSHNKKRPNNXjVIGRMYDYHVLDMIELGIENFV 
SLKE1 XNSKCPEGTKPML1 FAGDDFDVTEDYRRLXSLLIDFFRG 
PTVSNIRLAGLEYVLHFTA1>NGKI YFRS YKLLLKKSGCRTPRI E 
LEEfvlGPSLDLVLRRTHLASDDLYKXSMKMPKALKPKXXKNISHD 
TFGTTYGR 1 KMQXQDLS KLQTRXM\ XGLXXRPAER I T3DHEKXS 
XRJKKKLMELSOPLLFHCVLLKRIIKHQSIQSFL 




1 


3136 


AAALGMLLWFPACOAFNLDVEKLTVYSGPXGSYFGYAVDFHIPD 
ARTASVLVGAPKANTSOPD 1 VEGGAVYYCPWPAEGSAQCRQI PF 
DTTNNRXIRVNGTXEPIEFKSNQWFG\ATVXA\HXGXSCGPVAP 
LLFT14RNFLKPT PEKGP VGTC YVAIQNFSAYAEFS PCGMSKADP 
EGOGYCQAG FSLDFY XNGDL3 VGG PGSFYW0G0V1 TAEVADI I A 
NYSFXPILRKLAGEKQTEVAPASYDDSYLGYSVAAGEFTGDSQC 
ELVAG I PRGAQNFG YVS I INS YDMTFIQNFTGEQMAS YFG YTW 
VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLTLOVSSLL 
FRDPQILTGTETFGRFGSAMAHIjGDIjNQDGYNDIAIGVPFAGXD ; 
QRG KVbl YNGNTKDGLNTXPFPK FCC3GVWASHAVPSGFG FTLRGD 
SDI D KNDYPDL I VGAFG TGKVA VY3ARP VVrVDAQLIj L»H PMI IN 

S LKOXGAI XRTLFLPNHOAHRVFPLVIXRQXSHQCQDFI VYLRD 
E ?E FRDXLS PINISLNYS LDES TFXEGLE V KP I LNYYR EN IVS E 
QAH1 LVDCGEDNLCVPDLKLSARPDKHOVI IGDENHLMLI INAR 
NEGEGAYEAELFVMIPEEADYVGIERNNKGFRPLSCEYKI4ENVT 
RM WCDLGNPKVSGTNYS LGLR FAVPRLEXTNMS INFDLQ I R SS 
NKDK PDSNFVSLQIN3TAVAQVEIRGVSHP PQ I VXP IHNWEPEE 
EPHXEEEVG PLVEH I YELHN I GPSTI SDTI LE VGWPFSARDE FL 
LYIFHIQTLGPLQCOPNPNINPQDIKPAASPEDTPELSAFliRNS 
TI PHLVRKRDVHWEFK R QSPAKI LNCTN I ECLQI SCAVGRLEG 
GES AVLKVRS R LWAHTF LQR KNDP Y ALAS LVS FEVKKMP YTDQ P 
AKLPEGSIAIKTSVIV7ATPNVSFSIPLWVIILAILLGLLVLA1L 
TLALWKCG F FDRAR P PQE DMTDREQLTND KTP EA 


5959


1 


1166 


GTSGYAAOOLPSLLKEREFKLGTLNKVFASQWLNHROVVCGTKC 
KTLF WDVQTS QI TK 1 P I LKDREPGG VTOOGCG I HAI ELNPSRT 
LLATG GDN PN S LAI YRLP TLD P V C VG DDG H KDW I FS I AW I £ DTK 
AV SGS RDGSMG LWE VTDDVLTKS DARHNVS RVPVYAHI THKALK 
DIPKEDTNPDNCKVRALAFNNKNKELGAVSLDGYFKLWKAENTL 
SKLLSTKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPROPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGOGSLLFYDIRAORFL 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGL-KGNYAGLWS 


5960 


2853


870 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKE 
LDWASINGFCEOLNEDFEGPPLATRLLAHKIQSPQEWEAIQALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYOKLKKQG\IVKSDPKLPDDT 
TFPLPPPRPKNVIFEDEEKSKMLARLLKSSnPEDLRAANKLIKE 
KVQEDQKRMEKI S KRVNAI EE VNNNVKLLTEMVMSHS QGG AAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EALAEILQA 
NDNLTQVINLYXQLVRGEEVNC-DATAGSIPGSTSALLDLSGLDL 
PPAGTTYPAMPTRPGEOASPEQPSASVSLLDDELMSLGLSDPTF 
PSGPSLDGTGWNSFCSSDATEPPAPALAC'APSMESRFPAOTSLP 
ASSGLDDLDLLGKTLLQ0SLPPESQQVRWEKQOPTPRLTLRDLO 
NXSSSCSSPSSSATSLLHTVSPEPPRPPCQPVPTELSLASITVP 
LESIKPSNILPVTVYDQHGFRILFHFARJDPLPGRSDVLWWSM 
LS TAP Q PI RN I VFQS AVP KVMKVKLQ P PSGTEL PAFN P I VH P S A 
ITQVLLLANPOKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKI EDFKVGNLLGKGSFAGVYRAES I HT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIKCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHOI 3 T 
GMLYLHSHG I LHRDLTLSNLLLTRNMN I K 1 ADFGLATQLKM PHE 
KKYTLCGTPNYI SPELATRSAHGLESDVWS LGCMFYTLL1GRPP 
FDTDTVKNTLNKWJuAD YEMPTFLS I EAKDLIHQLLRRNPADRL 
SLSSVLDHPFKSRNSSTKSKDLGTVEDSIDSGHATISTA1TASS 
STSISGSLFDKRRLLIGOPLPNKMTVFPKKKSSTDFSSSGDGNS 
FYTOWGNQETSNSGRGRV J QDAEER PHSRYLRRA YSSDRSGTSN 
SQSQAKTYTKERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTS SS S GS FERPDNNQALSNHLCPGKT P FPFADPTPQTS 
TVQQWFGNLOI NAHLRKTTEYDS ISPNRDFQGHPDLQKDTSKNA 
WTDTKVKKNSDASDNAHSVKQQNTMKYI-1TALKSKPEIIQ0ECVF 

gsdplseqsktrgmsppwgyqnrtlrsitsplvahrlkpirqkt 
xkaws ildseevcvelvkeyasqeyvkevlqi ssdgntiti yy 
pngg\rgfpla\drppspt\dnisr\ysf\dnlpekywrkyqya 
srfvqlvrskspkityftryakcilmenspgadfevwfydgvki 

HKTEDFIQVIEKTGKSYTLKSE3EVNSLKEEIKMYMDHANEGHR 
ICLALESIISEEERKTRSAPFFPIIIGRKPGSTSSPXALSPPPS 
VDSN Y PTRDRAS FUR M VMH S AAS PTQAP I LK PS M VTN3GLGL7T 
TASGTDISSNSLKD CL PK S AQ LLKS V F V KNVGKATQ \ LTSGAVW 
VQ FNDGSQ L WOAG VS SIS YTS PNGO \ TTR\ YG ENE KLPDY I KQ 
KLQCLSSILLMFSNPTPNFH 


5962 


20 


2447 


R VCSS S AS TAS QA VMADAW E E I R RLAAD FQRAQ FAEATQRL S E R 
NCI E I VNKLI AQKQLE WHTLDGKEYI T PAQ IS KEMRDELKVRG 
GRVN I VDLQQVI NVDLI HI ENRI GDI I KSEKHVQLVLGQLI DEN 
YLDRLAEEVNDKLQESGQVTISELCKTYDLPGNFLTQALTORLG 
RI I SGH IDLDNRG VI FTEAFVARHKARI RGLFSAI TRPTAVNS L 
ISKYGFQEOLLYSVLEELVNSGRLRGTWGGRODKAVFVPDIYS 
RTOSTWVDSFFRONGYLEFDALSRLGIPDAVSYIKKRYKTTQLL 
FLKAACVGQGLVDQVEAS VEEAI SSGTWVDI APLLPTS LS VEDA 
AI LLQQVKRAFS KQAS T VVFSDTVVVSE KF\ INDCTEL KKE LMH 
OKAEKEMKNNPVHLITEEDUKQISTLESVSTSKKDKKDERRRKA 
TEGSGSMRGGGGGNAKEYKIKKVKKKGRKDDDSDDESOSSHTGK 
KKPE I S FMFQDE I ED FLRKH IQDAPEEFI £ ELAEYL I K PLN KTY 
LEWRSVFMSSTTSAEGTGRKRTIKDLQEEVSNLYNNIRLFEKG 
MKFFADDTQAALTKHLL KS VCTD ITNLI FNFI ASDLMMAVDDPA 
AITSEIRKKILSKLSEETKVALTKLHNSLNEKSIEDFISCLDSA 
AEACDIMVKRGDKKRERQILFOHRQALAEOLKVTEDPALILHLT 
SVLLFQFSTHSMLHAPGRCV PQ1 IAFLNSK3 FEDQHALLVKYQG 
LWKQLVSQS KKTGQG DY P LNNELDKSQE DVA S TTR KE LQ E LSS 
S3 KDLVLKSRKSSVTEE 


5963 


62 


1130 


PVvNPODFPGNRGLMG\OKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSFGEPGY 
MGLPGI QGKKGDKGNOGEKG 2 QGQKGENGROG I PGQOG 1 OGHHG 
AKGERGEKGEP G VRGA 3 GS KGESGVDG LMG PAG P KGQPGD P G PQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQLPVLL0SGR1RNCDH 
CLSQHGSPG1PGPPGP3GPEGPRGLPGLPGRDGVPGLVGVPGRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPG3SKEG 
PPGDPGLPGKDGDHGKPGIOGOPGPPGICXPSLCFSVIARRDPF 
RKGPNY { 


5964 


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRFGGMEEACQVQTTK 1 
RGDPHELRNI FLO YASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKI VQLLAGVADQTKDGLI SYQEFLAFESVLCAPDSMFI VAFQL 
FDKSGNGEVTFENVXE1 FGGTI IHHHIPFNWDCEFIRLHFGHNR 
KKHLNYTEFTQFLOELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVT I R S HMLTP FVE ENLVS AAGGS I SHQVSFS YFNAFNS LLNNM 
ELVRKI YSTLAGTRKDAEVTKEEFAQSAI RYGOATPLEI DI LYQ 
LADLYNASGRLTLADIERIAPIAEGALPYNLAELQRQQSPGLGR 
P I WLQ 1 AES AYRFTLGS VAGAVGATAVY? 1 DLVKTRMQNORGSG 
S WGELMYKNSFDCFKKVLR YEGFFGL YRGLI PQL I G VAPE KA I 
KLTVKDFVRDKFTRRDGSVPLPAEVLAGGCAGGSQVIFTNPLEI 
VKIRLQVAGEITTGPRVSALNVLRDLGI FGLYKGAXACFLRDI P 
FSA I Y FPVYAHCKLLLADENGH VGGLNLLAAGAMAG \ VPAASLV 
TPAD V I KTRLQVAARAGQTTY SG VI DCFRKI L\REEGPSAFWKG 
TAARVFRSSPQFG\VTLVTYELLQRGFYIDFGGLKPAGSEPTPK 
SR I ADLPPANPDH I GG Y RLATATFAG I ENKFGLYLPKFKSPS VA 
WQPKAAVAATO 


5965 


1 


1498 


MVTWLYRFLPTSNMAAKLRSLLPPDLRLQFWLHARLQKCFLSRG 
CGS YCAG AKA S PLPG KM AMGLMCGRRELLR LLQSG R R VHS VAG P 
SOWLGKPLTTRLLFPAA PCCCR PHYXFLAASGPRSLSTSAI S FA 
EVQVQAPPWAATPS PTAVPE VASGETADWOTAAEQS FAELGL 
GSYTPVGLIQNLLEFMHVDLGLPWWGAIAACTVFARCLIFPLIV 
TGQREAAR 11 1NHLP E I Q KFS SRI REAKLAGDK I E Y YKASS EMAL 
YQXKHGI KLYKPLI LPVTQAPI FI SFFIALREMANLPVPSLQTG 
GL W WPQDLTVS DP I Y I L PLAVTATMWAVLELGAETG VQSS DLQ V7 
MRNVI RMMPLI TLP I TMH FPTA VFM YWLSSNLFSLVQVS CLR I P 
AVRTVLK I PQRWHDLDKLPPREGFLES FKKGWKNAEMTROLRE 
REQRMRNQLELAARG PL RQT FTHNPLLQ PG KD>JP PN I PSS \SS S 
SSKPKSKYPWHDTLG 


5966


102 


1925 


RS KQ VMAR LTKRRGADT KA I OH LWAAI E I I RNQ KQ IAN I DR I TK 
YMSRVHG^PKIITTROLSLAVKDGLIVETLTVGCKGSKAGIEOE 
GYWLPGDE3 D WETENHD W Y CFECHL PGE VL I CDLC FR VYH S K CL 
S DE FRLRDSSS PWQCPVCRS I KKKNTNKQEMGTYLRFI VS RMKE 
RAIDLNKXGKDNKJ^PMTORLVHSAVDVPTIOEKVNEGKYRSYEE 

KNCFYIAKARPDNWFCYPCIFNHELDWAKWKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAW I PS EN I QD I TVN1 HRLHVKRSMG WKKA 
CDELELHQRFLREGRFWKSKNEDRGEEEAES S I SSTSNEQLKVT 
OEPRAKKGRRNOSVEPKKEEPEPETEAVSSSOEIPTMPOPIEKV 
S VSTOTKKLSASS PRMLKRSTQTTNDGVCOS MCHDKYTKI FNDF 
KDRMKSDH KRETERWR SALE KLRSEME EEKRQAVN KAVANMQG 
EMDRKCKQ VKE KCKEE F V EE I KKLATQH KQb I SQTK K KQWCYNC 
EEEAMYHCCWNTSYCS2KCCGEHWHAEHKRTCRRKR 


5967 


102 


1925 


RSKOVMARLTKRROADTKAIQKLWAAIEIIRNOKQIAI^IDRITK 
YMSRVHGMHPKETTROLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 
SDEFRLRDSSSPWQCPVCRS1KKKNTNKQEMGTYLRFIVSRWKE 
RAIDLNKKGKDNKHPMYRRLVKSAVDVPTIOEKVKEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEOADIARMLYKDTCHELXDELQLC 
KNCFYIiANARPDNWFCYPCIFNHELDWAKMKGFGFWPAKVMQKE 

CDELELHQR FLREGR FW KSKNE DRGEEEAES SISSTSN EQLKVT 
QEPRAKKGRRNQSVEPKKEEFE PETEAVSSSQE I PTMPQP I EKV 
SVSTQTKKLSASSPRMLHRSTCTTNDGVCOSMCHDKYTKIFNDF 
KDRMKSDHKRETERWREALEKLRSEMEEEKR0AVNKAVANMC3G 
EMDRKC XQV KE KCKEE F V EE1 K KLATQHKOL I S QTKKKQWCYNC 
EEEAMYHCCKNTS YCS I KCQOEHWHAEHKRTCRR KR 


5968 


81 


1288 


VRFPRRGGAPPTVliTPGROOGVFLGPQRPGSEPDIPARGQPHPP 
RPVGVS'rSAQAQVQPPAMHRRRLALGLGFCLLACTSLSVLWVYL 
ENWLPVSYVPYYLPCPEIFNMKLHYKREKPLQPWWSOYPQPKli 
LEHRPTQLLTLTPWLAPIVSEGTFNPELLQH1Y0PLNLT1GVTV 
FAVGN/HFLESAEEFFKRGYRVKYYIFTDNPAAVPGVPLGPHRL 
LSSIPIQGHSHWEETSKRRKETISQHIAKRAKREVDYIjFCLDVD 
MVFRNPWGPETLGDLVAA3KPSYYAVPRQQFPYERRRVSTAFVA 
DS EGDF Y YGG AVFGGQVAR VY E FTRGCHMA I LAD KANG 1 MAAW R 

rrcuT wtj irt^T C7Kjzr r»c? err tt 0(]t*vf T«7r^r\T> vnAnrrT t ' 7 TnrcTT 
EnSHLJVRHFJ ^NKPSKVLbPLy LWDiJKKPyPFbLiKbl Rhb J JbUK. 

DISCLRS 


5969 


1126 


503 


DVGFNIKRKRCDLDVFLESPRKPSGRRDRAPEKORRIAANKCLC 
TGVREGE PPS/TTSOKVKEAGRDFTYLI WLFGI S I TG3LFYT I 
FXSLFSSSSPSKIYGRALEKCRSHPEVIGVFGESYKGYGEVTRR 
GRRQHVR FTEYVKDGLKHTCV KF YI EGSE PGKQGT V Y AOVKEN P 
GSGEYDFR Y I FVE I ES Y PRRT 1 1 IEDNRSGDD 


5970 


316 


4712 


S0DN1GHRLLOKHGWKLGCX5LGKSLQGRTDPIP1WKYDVMGMG 
RMEMEIiDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANF y CELCDKQ YQKRQEFDNH I NSYDHAHKQRLKDLK 
QREFARNVSSRSRKDEKKQEKALRRLHELAEQRKCAECAPGSGP 
MFKPTTV AVDE EGGEDDKDE S7*?N SGTGATAS CGLGS E F S TDKG 
GPFTAV01TNTTGLA0APGLAS0GISFG3KNNLGTPLQKLGVSF 
SFAXKAPVKLESIASVFKDKAEEGTSEDGTKPDEKSSDOGLOKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEG AGATE 
PEYYHYI PPAKCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGSS PKPKSCI KAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 
KAEAKKALGGDVSDQSLESHSOKVSETOMCESNSSKETSLATPA 
GKESQEGFKHPTGPFFPVLSKDESTALQWPSELL1FTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLOGLDPGE 
PK KS KEVGGEK I VRSSGG RMD AP AS GS ACS GliNKQE PGG S HGSE 
TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGS SGKKDEGGGGSE SQDHGGRKH 
KGELPPSSCORRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 
KSPS0YSEEEEEEDSGSERSRSR5RSGRRKSSHRSSRRSYSSSS 
DASSDQSCYSRQRSYSDDSYSDYSDRSRRKSKRSHDSDDSDYAS 
SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 
SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSORSGSR 
XRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 
EXIQSRICVERKPSVSEEVOATFNJ<AGPKLKDPPQGYFGPKI.PPS 
LGNKPVLPLIGKLPATRKPNKKCEESGIiERGEEQEQSETEEGP? 
G S SDAL FG HQF P \ S EETTG PLLDP P PEES KSGE VTADHP VAPLG 
PPAHFDCYIX3DPTISHNYLPDPSDGNTLESLDSSSOPGFVESSL 
LPIAFDLEHFPSYAPPSGDPS I ESTDGAEDA\SLAPLESQPITF 
TPEEMEKYSKLQQAAQQKIQQQLLAKQVKAFPASAAIjAPATPAL 
0PI HI QQPATAS ATS ITTVQHAILOHHAAAAAAA IGIHPHPH PQ 
PIiAOVHHIPOPHLTPISLSHLraSIlPGHPATFLASHPIHIIPA 
SAIHPGPFTFHPVPHAAbYPTLbAPRPAAAAATALHbHPLLHPI 
FSGQDLQHPPSKGT 


5971 


53 


2149 


SFLYFVGVDMDNPIGNWDGRFDGVQLCSFACVESTIIjLHINDII 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RS2LFYTLNGSSVDSQPQSKSKNTV1YIDEVAEDPAKSLTEIST0 
FDR3S P P LQPPPVNS LTTENRFHSLPFS LTKM PWTNGS 1 GHS PL 
SLSAOSVMEELNTAPVQESPPLAMPPGNSHGLEVGSLAEVKENP 
P FYGVIRWI GQPPGLNEVLAGLELEDECAG\CTDGTF/ REGTRY 
FTCALKKALFVKLKSCRPDSRFASLQPVSNOIERCNSLAIVJEAY 
LSEWEENTPTQKWEKEGLEIMIG\KKKGIOGHYNSCYLDSTLF 
CL FA FS S VLDT VLLR P KEKNDVE Y YSETQEbbRTE I VN PLR I YG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLNILFHHIliRV 
BPLLKIRSAGQKVQDCYFlQIFMEl^EjWGVFi XUKjhubwbtr i W 
SNbKFAEAPSCLIIQMPRFGKDFKLFKKIFPSbEbNITDLbEDT 
PRQCRICGGlAMYECRECYDDPniSAGKlKQFCKTCNTQVHLHP 
KRLN>3KYNPVSLPKDLPDWDWR«GCIPC0^ELFAVLCIETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQNG FNI PQVTPCPEVGEYL 
KMSLEDLKSLDSRRIQGCARRLLCDA1YVPCT0SPTMSLYK 


5972 


440 


1761 


I LLAG S P S PR DQCSQRQS SGGDKEbVTRGCT F ST AWS PS AMTQ 
EPFREBIAYDRMPTbERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SKKTWVFSVLMGSCLLVTSGFSLYLGNVFPAEMDYLRCAAGSCI 
PSAIVSFTVSRRNANVIPNFQILFVSTFAVTTTCLIKFGCKLVI. 
NPSAIN I NFNLI bbbbLELbMAATVI I AARSSEEDCKKKKGSMS 
DSANI LDEVP FPARVLKS YS WE VI AGISAVLGG 1 1 ALNVDDSV 
SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 
TS P LLFTASG Y bS FS I MR I VEMF KDY ? P Al KP S Y DVLLLbLLLV 
bLLQA/GPQKGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 
G<?EPPEGVROGESLESRRGANGPVTPRRGKRVAAPSLAPGMETH 
NP 


5973 


65 


2007


HGDGKDbFGHl WAWRSNGI I SNFRRSPHAGMAEDEPDAKSPKTG 
GRAPPGGAEAGEPTTLLQRLRGTISKAVONKVEGILODVQKFSD 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLBE 
HTDTCLPKQSVYDAYRKYCESLACCRPbSTAKFGKI IRE IFPDI 
KARRLGGRGOSKYCYSGIRRKTbVSMPPbPGLDLKGSESPEMGP 
E VTPAPRDELVEAACALTCDWAERI LKRSFSS I VEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERLAQPP KDbEARTGAG P LARG ERKKS WES S APGANN LQV 
NAbVARLPLLbPRAPRSLI PPI PVSPPIbAPRLSSGALKVATLP 

Lo o KAwArPAAV ir 1 1 JL Fin lUf i v FAurb P \jr\3r*aru± rtr\3\ztLj 1 y iris. 

G TENRE VG IGGDQG PHDKG ViGRTAE VP VS EASGQAPPAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRb 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVbAOG\OGDGTVSK 
GGRGPGSOHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPLAKGEV 
DTAPOGNKDbKEHVLQSSLSQEHKDPKATPP 


5974


4293 


2200 


LGLQMHTTSGR I KQAMVTS LN EDNESVTVE Wl ENGDTKGK \E I D 
LESIFSLNP\Db\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ AS I KNDPPS \RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
D1SPVQAAKXEFGPPSRRKSNCVKEVEKLQEKREKRRLQ0QELR 
EKRAQDVDATNPNYE I MCMIRDFRGSLDYRPLTTADP I DEHR I C 
VCVRKRPLNKKETQMKDLDVITIPSKDWTWHEPKOKVDLTRYL 
ENQTFRFDYAFDDSAPNEMVYRFTARPLVET I FERGMATCFAYG 
OTGSGKTHTMGGDFSGKNODCSKGIYALAARDVFLMLKKPNYKK 
LELQV YATFFE I YSGKVFDLLNRKTKLRVbEDGKQQVOWGbQE 
REVKCVEDVLKLlDIGNSCRTSGQTSANAHSSRSHAVFQIIbRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECI RALGRKKPHTPFRASKIiTQVLRDSFl GENSRTCMIATISPG 
MASCENTl^Tl^YAhFRVKELTVDPTAAGDVRPIMHHPPNQI\DD 
LETOWGVGSSPORDDbKbbCEQfJEEEVSPQLFTFHEAVSQMVEM 
EEQWEDH RAVFQE S I RWLEDEKALLEMTEEVDYDVDS YATQLE 
A I LEQKI D I LTELRDKVKS FRAALQEEEQASKQ 1 N F KRPRAL 


5975 


4293 


2200


LGL0KHTTSGR1 HQAMVTSLNEDNESVTVEWIENGDTKGK\ E I D 
LE5I FSLNP\ DL\ VPDSEIEPSP\ETPPPPASSAKVNKI VKNRR 
?V\ AS 1 KNDPPS \ RDNRWGSARARPSQF?EQF£SAQQNGSV\S 
D I S PVOAAKKEFGP PS RRKSNCV KEVEKLQE KREKRRLQQQELR 
EXRAQDVDATNPNY E I MCM1 RDFRGSLDYR PLTTADP I DEHR I C 
VCVRKR PLNKKETQMKDLDVI TI PS KDWMVHE PKQKVDLTRYL 
ENQTFR FDYAFDDS APNEMV YR FTAR PLVET I FERGMATCFAYG 
OTGSGKTHTMGGDFSGKNCDCSKGIYALAARDVFLMLKKPNYKK 
L ELQ VYATFFE I Y S G KV FDI i J >I3 R KTK LRVLE DGKQQVQ WGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSKAVFQIILRR 
KGKLHGKFSLIDLAGNSRGADTSSADKQTRLEGAEINKSLLALK 
ECI 3ALGRNKPHTPFRASKLTQVLRDSF3 GENSRTCMI ATI S PG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRP 1MHKPPNQI \DD 
LETQWGVGSSPQRDDLKIiLCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQWEDHRAVFQESIRWLEDEKALLEMTEEYDYDVDSYAl'QLE 
AILEOKID I LTELRDKVKS FRAALQEEEQASKQ1NP KRPRAL 


5976


20 




VHHbHLTRVSWVNLDIILRJAOOMGlKTLNLVLG\LKRA\LEF 
P E VS WMEV KD PNMKGAM LTNTG K YA I PT I DA \ EAYA I G KKEKP P 
FLPEEPSSSSEEDDPI PDELLCL I CKD 3 MTDA WI PCCGNS YCD 
ECIRTALLESDEHTCPTCHQNDVSPDALIANXFLRQAVNNFKNE 
TGYTKRLRKQLPSPPPPIPPPRPLI0RNLQPLMRSP1SRQQDPL 
M I P VTS S S TH PAPS 3 S S LTSNQS S LAP P VS GN P S S AP AP VPDI T 
ATVS I S VHSEKS DG P FRDSDNKI LPAAALASEHSKGTSS I AI TA 
LMEEKGYQVPVLGTPSLLGQSLLHGQLIPTTGPVRINTARPGGG 
RPGWEHSNKLGYLVSPPQQIRRGERSCYRS1NRGRHHSERSQRT 
CGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGV0TAHSNTIPrTO 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 
RSHSRS YSRSPP Y PRRGRGKSRNYRSRSRSHGYHRSRS RS PPYR 
RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM 
KAYYGRSVDFRDPFEKERYREWERKYKEWYEKYYXGYAAGAQPR 

NYPEKLSARDGHNQKDNTKSKEKESENAPGDGKGNKHKKHRKRR 
KGEESEGFLNPELLETSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
S KKEN I VKPAKG PQEKVDG\DVRDLLDLNL\OLKKPKEETPKDL 
TILNHHLPLRRMKKSL\EPP\EKLTLNQOK\TPRNKTSQRGKSE 
EGLFORCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSKFQCLSLHS INH1 LHPGAG VAAG PATGW / RE YLT 
PVLKES KFKETGVI TPEEFVAAGDHLVHHCPTWQWATGEELKVX 
AYLPTGKQFLVTKtlVPCYKRCKQWEYSDELEAIlEEDDGDGGWV 
DTYHNTG I TGI TE AVKE 1 TLENKDN I RLQDCSALCE EE EDEDEG 
EAADMEEYEESGLLEfTDEATLDTRKIVEACKAKTDAGGECtAIIjQ 
TRTYDL Y I TYDKY YQTPRLWLFGYDEQRQ PLTVEHM YEDISQDH 
VKKTVTI ENHPHLPPPPMCSVKPCRHAEVMKKI I ETVAEGGGEL 
GVHMYLLIFLKFVQAVIPTI EYDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCCSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRO 
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQLRELViaCRWAEEVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMH 
GGHTFKPLAEIYEQHVTKVNEEVAKLRRRLMELISLVQSVERNV 
EAVRNAKDERVRE I RNAVEMM I ARLDTQLKNKLI TLMGQKTSLT 
OETELL ES LLQE VE HQLRS CS KSELISKSSEI LMMFQQ VHRKPM 
AS FVTTP VP PDFTS E LVPS YDS ATFVL EN FS TLRQRAD P VYS P P 
LOVSGLCWRLKVYPDGNGWRG YY LS V FLELS AGLPETS KYEYR 
VEMVHOSCNDPTKNIIREFASDFEVGECWGYNRFFRLDLLANBG 
YLNPONDTVILRFQVRSPTFFQKSRDOHWYITQLEAAQTSYIQQ 
I NNLKERLTI ELS R TQKSRDLS P PDNHLS PQNDDALETRAKKS A 
CSDMLLER \GPYSAS \ VR EAKEDEEDEEKI QNEDYHKELSDGDL 
DLDLVYEDEVtfQLDGSSS SASSTATSNTEEND I DEETMSG ENDV 
E YNNM ELEEG ELMEDAAAAGPAGS S KG Y VGSSSR I SRR TKLC5 A 
ATS S LLC I DPLI LIKLLDLKDRSS I ENL WG LQP R PP AS LLQ PT A 
£ YSRKDKDQR KQQAMWRVPS DLKMLKRLKTQMAE VRCMKTDVKN 
TL S E I KSS 5 AAS GDMQ7S L FS ADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRKSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRAL1HGSIGD3LPKTE 
DRQC KALD SDAVWAV FSGLPAVE KR RKMVT LG AN AKGGH LE3L 
GMTDLENNSETGELQPVLPEGASAAP2EGMSSDSDIECDTENEE 
Q EEHTS VGG FHDS FMVMTQFPDEDTHSS FPDGEQIGPEDLS FNT 
DENSGR 


5979


212 


3665 


LPDMTMYLWLKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNASE 
TTTLS P SGS AVISTTTI ATTPS KPTCDEKYAN I TVDYLYNKETK 
LFTAKLNVNENVECGNNTCTNNEVHNLTECKNAS VS I SHNSCTA 
PDKTLILDVPPGVEKVPVHCCSXOVEQPDSTIWLKWKNIETSTC 
DTQN I T YRFQCGNMI FDN KEI KLENLEPEHE YKCDS EI LYNS HK 
FTNASKI I KTDFGS PGEPOI I FCRSEAAHOG VI TWNPPQRSFHN 
FTLCYI KETEKDCLNLDKNLI KYDLONLKPYTKYVLSLHAY 1 1 A 
KVQRNG S AAMCHFTTK SAP ? SQVViNMTV SMT SDN SMHV KCR P PR 
DRNGPHERYHLEVEAGNTL\mNESHKNCI)FRVKDLQYSTDYTFK 
AY FHNGDY PGEP FI LHHS TS YNSKAL I AFLAFLI I VTS I ALLW 
LYKI YDLHKKRS CNLDEQQELVERDDEKQLMNVEPIHADI LLET 
YKRKIADEGRLFLAEFQSIPRVFSKFPIKEARKPFNQNKNRYVD 
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DET VDDFWRMI WEQKATVI VMVTRCE EGiVRNKCAEYWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYI IQKLNI VNKKEKATGREVTHI Q 
FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGP1WHCSAGVGR 
TGTY I G I DAMLEGLEAEN KVDVYGY WKLRRORCLMVQVEAQY I 
LI HQALVEYKOFGETEVNLS ELHPYLHNMKK.RDP PSEPSPLEAE 
FQRLP S YRSWRTQKIGKQEX ENKSKNRNSNVI PYDYNRVF LKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 

EGKQTyGDIEVDLKDTDKSSTYTLRVFELRKSKRKDSRTVYQYO 
YTNWS VE QLPAE P KELI SMI 0 WKOK LPOKNSS EGNKHHKST PL 
LIHCRDGSOQTGIFCALLNLLESAETEEWDIFQWKALRKARP 
GMVSTFEQYQFLYDVI AS TYPAQNGQVKKNNHQEDKI EFDNEVD 
.KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEGPEHSVNGP 
ASPALNQGS 


5980


3 


2363 


DAWGCKLRRLRFTYGTOTRVSLALPGQYELVHTLVAHOGNWETI 
PEEDLEVQENNEDAAHDLTELEVTKHKALLQEVDVVVAPCOGLR 
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELOEIRKYFSFP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGODTKAOSMLVEOSEKLRHLSTFSHOVLQTRLVDAAKALN 
LVHCHCLDIFINQAFDMORDLQITPKRLEYTRKKENELYESLMN 
IANRKQEEMKDKIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTREI KCCI RQ IQELI I SRLNQAVANKLI SS VDYLRES FVGTL 
ERCU3SLEKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
ML WEO I KQ I I QR I TWVS P FA I TLEWKR KVAQEAI ESLSAS KLAK 
S I CSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRLARLSLESRSLQDVLLHRKPKLGQELGRGQYGVVYLCDN 

wgghfp calks wp pdekhwndlale fk ymr slpkherlvdlkg 
s v i dynyggg ssi avll1 merlhrdlytglkagltletrlq lal 
dwegirflhsqglvhrdiklknvlldkqnrakitdlgfckpea 
mmsgs i vgtpi hkapelftgk ydws vdvya fgilfwyicsgs vk 
lpeafercaskdhlvmnvrrgarperlpvfdeecwqlmeacweg 
dplkrpllgivqpmlqgimi^lcksVnseopnrglddst 


5981


1 


251$ 


GRKHSAAME'RPWGAADGLSRWPHGLGLLLLLOLLPPSTLSQDRL 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGRVRDFVAiO>A^THQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIKTFGOSKLYRSEDYGKNFKDITDLINNTFIR 
TEFGMAIGPEKSGKWLTAEVSGGSRGGRIFRSSDFAKNFVQTD 
LPFHPLTQMMYSPQNSDYLLWjSTENGLWSKNFGGKWEEIHKA 
VCLAKWGSDNTI FFTT YANGS CKADLGALELWRTSDLGKSFKTI 
GVKIYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSWAQLP 
S VGQEQFY S I LAAN DD M V FMHVDE PGDTG FGTI FTS DDRG I VYS 
KSLDRHLYTTTGGETDFTNVTSLRGVYI TSVLSEDNS IQTMITF 
DOGGRWTHLRKPENSECDATAKl^KI^CSLHIHAS YS I SQKLKVP 
MAPLSEPNAVGIVXAHGSVGDAISVMVPDVYISDDGGYSWTKML 
EG PHY YT I LDSGG 1 1 VA IEHS S R PI NVI KFSTDEGQCWQTYTFT 
RD P I YF TG LASEPG AR S MT41 S I WG FTES FLTSQWVS Y T I D FKD 1 
LERNCEEKD YTI KLAK STDPEDYE DGCI LG YKEQ FLR LRKSS VC 
QNGRDY WTKQPS I CLCSLEDFLCDFGYYRPSNDS KCVEQPELK 
GKDLEFCbYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 
CTSNFLS PE KQNS KSNS VP1 1 LAI VGLMLVTWAG VL 1 VXKY VC 
GGRFLVHL»YSVLQOH\AEA\NGVDGVDALDTASHTNKSGYHDDS 
DEDuLE 


5982 


56 


231fc 


ATR ? PRGS S WCROFS RTASAAPGRSiVMLR I PVRKALVGLS KS PK 
GC VRTTATAASNL1 EVFVDGOS VMVEPGTTVLQACE KVGMQI PR 
FCYHERLSVAGNCRMCLVEIEKAPKWAACAMPVMKGWNILTNS 
EKSKKAREGVMEFLLAI'JHPLDCPlCDQGGECDLQDQSMMFGNDR 
SRFLEGKRAVEDKKIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 
LGTTGRGNDMQVGTY 3 EKMFKSELSGNI IDI CPVGALTSKPYAF 
TAR PWETR KTE S ID VK DAVGSN I WS TRTG E VMRI L PRM HED I N 
EEW I S DKTR FA YDGLKRQRLTE PM VRNE KGbLT YTS W EDALS R V 
AGMLOSFOGKDVAAIAGGLVDAEALVALKDLLNRVDSDTLCrEE 
VFPTAGAGTDLRSNYLLNTTIAGVEEADWLLVGTNPRfEAPLF 
NAR 2 R KS WLHNDLKVALI GSPVDLTYTYDHLGDS PKI LQD I ASG 

TSGVTGDWKVMNI LHR I ASQVAALDLG YKPGVEAI RKNPPKVX.F 
LLGADGGC1TRQDLPKDCFIIY0GKHGDVGAPIADVILPGAAYT 
EKS ATYVNTEGRAQQTKVAVTPPGLAREDWKI I RALSE I AGMTL 
P YDTL \ DQ VRNR LEE VS PNIATR YDD I EG \ANYFQQANEbS KL VN 
OCLLAJDPLVPPQLTMKDFYMTDSISRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5983 


248 


1763 


E ARGDGGRRRHRASG R RAGRG EP \ AGL KSQGQRA V P KRA VARGG 
RQ \ Y S AA I ALL.E P AGS E 2 ADD LS I L YS NRAACYLKEGN CSG CI C 
DCKRAL2LHPFSMKPLLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLOLANDSVNRLSRILMELDGPNWREKIjSLIPAVPASVPLOAWK 
P AKEM I SKQAGDS S SHR OQGI TDEKTFKALKEEGNQCVNDKNY K 

QUuOGl^AFYRRALAKKGliKNYOKSLIDI.NKVILLDPSl 3 EAK 
MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLIiAITAPKDLPMFLSNKLEGDTFLLLIQSLKNN 
LIEKDPSLVYQHLLYLSKAERFKMMLTLISKGOKELIEOLFEDL 
SDTPNNHFTLEDI QALKRQYEL 


5984 


755 


1192 


SSVCMACTYVSNliGKKQRSVSFLASGLMRVSTGPELRLHHSFVL 
TGDVGRRICRLLVGLFTKGDTSSKRVHPFSPGPCFliLCDLARVG 
SSPKINVSPFYON\QTSTORSCTVFVMQRCSLVGPFQVTVFTMY 
FHHSLRSISRFSSG 


5985 


22 


1406 


RRVARPGTA£PAJG^RRTVRRGRARRDLAGAERKAGVSERGDSGR 
R R PN PS I PS AAAGMS H J QI P PGLrTELLQG YTVEVLRQQP PDL VE 
FAVEY FTRLREARAPAS V LP AATPRQSLGH P PPE PG PDRV ADAK 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
H P XTDE QRCRLQEACKDI LL FKNLDQEQLS QVLDAM FER I VKAD 
EHVI DOGDDGDNFYVI ERGTYDI LVTKDNQTRSVGQYDNRGS FG 
ELALMYNTPRAATI VATS EGSLWGLDRVTFRRI I VKNNAKKRKM 
FESFIESVPLLKSLEVSERMKITOVIGBKIYKR/DGERIITQGE 
K\ADSFYIISSGEVSIItIRSRTKSNKDGGNQEVEXARCHKGQYF 
GEZALVTNKPRAASAYAVGDVKCLVKDVQAFERLLGPCMDIMKR 
N 1 SHYEEQLVKMFGSSVDLGNLGQ


5986 


1806 




DAWKCTSLTFKWKLWGRHRGRRKG1AHPKNHLSPO0GGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALf-iAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LI£NVCSIGDKVA0E1.PQGSDLGI-1AEEAERPGEK\AG0KSPLRE 
EHVTCV0S3bDEFLQT\yGSLl?LSrDEWEKLEDIFQCEFSTF 
SRKGL.VLQL1 OS YQRMPGNAMVRG FRVAYKRHVLTMDDLGTLYG 
QNW LiNDQVMN MYGDLVMDTVF E K \ VHFFNS FFY\DKLRTKGYDG 
VKRKTKTtfVD I FNKELLL I PIKLE VHWSLISVDVRRRTI TYFDSQ 
RTLKRRCPKKIAKYLOAEAVKKDRLDFHQGWKGYFXMNVARQNN
DSDCGAFVL0YCKHLALSQPFSFTQQDMPICLRR01YKELCHCKL
TV 


5987 


1806 




DAW KS TSLTFH W KLWG R H RGR K R G LAHF KNHLS P0QGGATPQ VP 
S PCC RFDSPRGF P F PRLGLLGALMAEDGVRGSP PVP SGP P MEED 
GLRWTPKSPLDPDSGLLSCTLFNGFGGQSGPEGERSLAPPDAS3 
LISJJVCSIGDHVAQEIjFQGSDIGMAEEAERPGEK\AGOKSPLRE
EHVTCVQS1LDEFLQT\YGSLIFLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYORMPGNAMVRGFRVAYKRHVLTKDDLGTLYG 
CN WLNDOVMN MYG DLVMDTVPE K \ VHFFNS FFY \ D KLRT KGYDG 
VKRWTKNVD1 FNXELLLIPIHLEVHWSLISVDVRRRTITYFDSQ 
RTLN RRCPKH I AK YLQAEAVKKDR LDFHQGWKG Y FKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFSFTOQDMPKLRROIYKEljCHCKL 
TV


5988 


1292 


4 3( 


FKKVFLS FLGLLESSHSRDRI HNLA^MFLLATHNLVWWFTCRFC 
RLDCI YLNAG 1 MPNPQLNI KALLFGLFS\ AEGLLTQGDK 1 TADG 
L0EVFETDVFGHFILIRELEPLLCHSDNPSOLIWTSSRNARKSN
FSLEDFQHSKGKEPYSSSKYATDLLSVALNKUFNQQGLYSNVAC
PGTALTNLTYG I LPPF1 WTLLMPA I LLLRFPANAFTLTp YNGTE 
ALVWLFKQKPESLN PLI KYLS ATTG FGRNY 1 MTQKMDLDEDTAE 
KFYOKLLELEKKIRVTIQKTDNQARLSGSCL


5989


194 


2611 


AMDF PQH SQH VLEQLNQQRQLG LbCDCTF WDG VH FKAH K A V LA
ACSEYFKMLFVDOKDWHLDISNA^.GLGQVLEFMYTAKLSLSPE
NVDDVL\ AVATFLQMQDI I TACHALKSLAEPATS PGGNAE/OAT
EGGEKRAXEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGOAQ
SAASGAEOTEKADAPREPPPVELKFDPTSGMAAAEAEAALSESS
EQEMEVEPARKGEEEQKEQEEOEEEGAGPAEVKEEGSOLENGEA
PEENENEESAGTDSGOELGSEARGLRSGTYGDRTESKAYGSVIH
KCEDCG K E FTKTGN FKR H I R I H TG E KP FS CR ECS KA FS DP AAC K
AHEKTHSPLKPYGCEECGKSYRL] SLLNLRKKRKSGEARYRCED
CX3KLFTTSGMLKRH0LVKSGEKPYQCDYCGRSFSDPTSKMRHLE
TKDTDKEHKCPHCDKKFNOVGN^KAHLKIHIADGFLKCRECGKQ
FTTSGK LKRHLRIHSGE KPYVC I H CQRQFADPGALQRHVR I HTG
EKPCOCVMCGKAFTQASSLIAHVROHTGEKPYVCERCGKRFVQS
SQLANH3 RHHDNIRPHKCSVCSKAFVNVGDLSKHI 1 IHTGEKPY
LCDKCGRGFNRVDNLRS-IVKTVHOGKAGIKILEPEEGSEVSWT
VDDM VTIJVTE AliAATAVTQLT VV P VG AAV TADETE VLKAE I SKA \
VXQV0EEDPNTHILYACDSCGDKFUDANSLAQHVR3HTA0ALVM
FQTDADFYQQYGPGGTWPAGQVLOAGELVFRPRDGAEGQPAJLAE
TSPTAPECPPPAE


5990 


2 


4700


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGECVLLHEEAGD 
SGFVSLSRLGPELRDKDLEMEELML0DETLLGTM0SYMDASL1S 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVASFSGQILAGELDNCVSSIPDFP 
MHIACPEEEDKATAAEMAVPAAGDES I SSLSELVRAMHPYCLPN 
LTHLASLEDELOEOPDDLiliPEGCWLEIVGQAATAGDDLEIPV 
WRQV S PG PR PVLLDDS LETS S ALC LLMPTLESETEAAVPXVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPONPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGOSTVGTEVTS 
QVDNLQKQPQEELQKESGPLQGKGK PRAWARAWAAALENSS PKN 
LERSAGQSS PAKEGPLDLYPKLADTI QTNPI PTHLSLVDSAQAS 
PMPVDS VEADPTAVG P VLAGPVPVDPGLVULASTS S ELVE PLPA 
EPVLIKPVLADSAAVDPAVVPISDNLPPVDAVPSGPAPVDLALV
DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES 
LDPP KTI I P E VX E WDSLKI ESGTS ATTHE ARPR PLS LS EY R K R 
RQQRQAETE E R S PQ P PTG KWPS L P ET PTGLADI P CLV I P ? A PAK 
KTALQRSPE7PLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLIARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSKGPVLPDPFTHYAPLPSWPCYPKVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGPQKAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
KKVSALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRE 
KPPLPATKAVPTPRCSTVPKLFAVHPARLRKLSFLPTPRTQGSE 
DVVQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 

gnsggvdipqekrpldrlqapelanvagltppatpphqlwkpla 
avsllakaks pks ta0egtlkpegvt2aki i paavrlqegvhgfs 
rvhvg5gdkdyc\vrsr7ppkk\mpallipevgsrwnvkrhqdi 
tikpvlslgpaappppciaasrepldhrtsseoadpsapclaps 
sllspeaspcrndmntrtppepsakqrskrcyrxacrsaspssq 
gwqgrhgrnsrsvssgsnrtseasssssssssssrsrsrslspp 
hkrwrrsscsssgrsrrcssssssssssssssssssssrsrsrs 
psprrrsdrrrryssyrshdhyqrqrvlokeraieerrvvfigk 
ipgrmtrselkqrfsvfgeieectihfrvqgdnygfvtyryaee 
afaaiesghklroadeqpfdlcfggrrqfckrsysdldsnredf 
dp ap vks k fds 1»d fdtll kq aq knlrr 


5991 


334 


1379 


RLSSHFSQCS PS I YC\TKFDKGGNVTS FERKKTELYQELGLQAR 
DLRFQHVMS I T VRNWR 1 1 MRME YLKAV I TP ECLLI LDY RNLNLK 
QWLFRELPS0LSGEG0LVTYPLPFEFRAIEALLQYW1NTL0GKL 
S I liQ PLI LE TLDALGDP KHS S VDRS KI»H I LLQNGKS LS ELE TD 1 
.KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM 
ELLLENY YRLADDLSNAARELRVLIDDSQS 1 1 F INLDSHRNVMM 
RLNLQLTMGT FS L S LFG LMG V AFGMNLESS LE EDHR I FWL I TG I 
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGCATEWRPLRPWNGAMEKLRRVL 
SGQDD5E0GLTA0DS01NL/SEVLDASSLS FNTRLKVJFAI CFVC 
GVFFS I LGTGLLWLPGG I KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKMFEATRLLATI VMLLCFI FTLCAALWKHKKGLAVLFC I LO 
FLS MTWYSLSYI PYARDAVI KCCS SLLS 


5993 


1650 


594 


AEGLGSWAVWAGLGWAGRHMEAGGATGALGVGCKLPSAFCFPGS 
SVAMDMFOKVEKIGEGTYGWYKAKNRETGOLVALKKIRLDLEM 
EGVPSTAI RE I SLLKELKH PNI VRLLDWHKERICLYLVFEFLSC 
DLKKYMDSTPGS EL PLHL I KS YL FQLLQG V£ FCHS HR VI HR DLK 
PONLLINELGAIKLADFGLARAFGVPLRTYTHEVVTLWYRAPEI 
LLATR F YTTAVD T W S I GC I FAEMVTR KALF PGDS \ E I DQ\ L FR I 
FRMLGTPSEDTWPGVTOLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RDLLMQLLQ YDP S OR I T AKTALAHP YFS S PE P S PAARQ YVLQRF 
RH 


5994 


394 


1934 


AGEVOLKVWIRGWRIQPO/KAAAIIDLDPDFEPQSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
POPG I LG AVTGPRKGGS RRNAWGNQS YAEL ISQAI ESAPEKRLT 
LAQI YEWMVRTVPYFKDKGDSNSSAGWKNS IRHNLSLHSKF I KV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKP SGLP A P P EGATP TS P VGH FAKWSGS PCSRNR E EADM WTT 
FRPRSSSNASS VSTRLS PLRPESEVLAEEI PASVSS YAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFSPAEGPLSAGEGCFSSSOALEALLTSDTPPPPADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMI APPPVMASAPI PKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MEKLECDMDNI I S DLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPG PG P ASGA W LCTRARGS AAF VP P LP R P PSRG AR R RRRL PG R 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RROELLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKONDORNRKRKAEPYETSQGKGTPRGHK
ISDVFERRVEOPI-VGLDGSAAKEATEEQSAIjPTLMSVMLAKPRL 
DTEQLAQRGAGLC FTFVSAOQMSPSSTGSGMTEHSCSSOKOI SI 
QHRQTNQSDLTjEKISALENSKNSDLEKKEGRIDDLLRANCDLR 
RQl\DEQQKMLEf;YK\ERLNRCFDNEPRNFLIEKSKQEKi v iACRD 
KSMODRLRLGHF'j-I'VRHGASFTEOWTCGYAFONLIKQOERINSQ 
REEIERQRKML^.fCRKPPAMG0APPATNEQK0RK5KTNGAENETL 
TLAEYHEQEEIFKLRLGHLKKEEAEIQAELERLERVRNLHIKEL 
KR I KNEDNSOFKTKPTLNDRYLLLHLLGRGGFSEVY KAFDLTEQ 
RYVAVKIHQLNKKWRDEKKENYHXHACREYR 1 HKELDHPRI VKL 
VDYFSLDTDSFCTVbEYCEGNDLDFYLKQHXLMSEKEARSlIMO 
I VNALKYLNE I K F ? I IHYDLKPGNI LLVNGTACGE 2 KI TDPG LS 
KI MDDDSYWS VDC-KELTSQGAGTYWYLPPEC FWGXEPPXI SNK 
VD V w S VGV I F YQCLYGRK P FGHNQ SQQD I LQEN T2 LKATE VOF P 
PKPWTPEAKAF] KRCLAYRKEDRIDVQQLACDPYLLPH I RKSV 
STSSPAGAAIASTSGASNNSSSN 


5996


1612 


981 


DQOACLLG LMLTljE FG ILE FDPS W I GSWTQR / S WVS WRS R PGCE 
LFSIWFGSIVNEGYLNSASEGEEFCIYNRNPNACSYGVAVGVL 
AFLTCLL YLALDVY F PQ I SS VKDRKK\ AVLSGH P WSGEPH PAA 
FWA?LWFTGDSCYL\ANQWQVSKPKDNP1.NEGTDASPGRPSPFS 
FFSlFTWSL7AAi,AVRRFKDLSFQEEYSTLFP\ASAQP 


5997 


1612 


981 


DQOACLLGLMLTLEFGILEFDPSVIIGSWTQR/SWVSWRSRFGCE 
LFS I WFGS I VNEGYLNSASEGEEFC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGKPWSGEPHPAA 
FWAFLWFTGDSCYiAANQWQVSKPKDNPLNEGTDASPGRPSPFS 
FFSI?TWSLTAALAVRRFKDLSFOEEYSTLFP\ASAQP 


5998 


1612 


981 


DOOACbLGLMLTLEFGIbEFDPSWIGSWTUR/SWVSWRSRPGCE 
LFSI WFGS I VN EG YLNSASEGEEFCI YNRN PN ACS YGVAVGVL 
AFLTCLLYLALD\A'FPQISSVKDRKK\ AVLSGH P WSGEPH PAA 
FWAFLWFTGDSCY1\ANQWQVSKPKDNPLNEGTDAS PGRPS PFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLFP\ASAQP 


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIVWGFHHKKGCOVEFSYPPLIP 
GIX5HDSHTLPEEWKYLPF1ALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFG I SCYR\01 Z AKALKVROAD I TR ETVQKS VCVLSKL PLYG 
LLOAKLQLI THAY FEEKOFSQISILKELYEHMNSSLGGASLEGS 
OVYLGLSPRDLVLK FRHKGLILFKLI LLEKKVLFYI SPVNKLVG 
ALMTVLSLFPGM1EHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTK'^GTIRKVMAGNHGEPAAMKTEEPLFOVEDSS 
KGOEPNDTNQYLK?PSRPSPDSSESDWETLDPSVLEDPN]jKERE 
QLGSDOTNLFPKDF VPSESLPITVQPQANTGQVVLI PGLI SGLE 
EDQYGKPIiAIFTKGYliCLPYMALQOKHXjLSDVTVRGFVAGATNI 
LFROOKHLSPA I VE V EEAL I Q I HD P ELRKLLN PTTADLR FADYL 
VRHV7ENK UOV FJL> DGTGWEGGDEW I RAQ FAVY I KALLAATLQLV 
LFRIVNVAKK1G^/MVTT\SRKW0TGK\AVG0SVGGAFS\SAK 
TA\MSSWLSTFTTE TSQSLTEPPDEKP 


6000 


101 


1561 


TEPCRTAENCTATKSE^3NKNSLESSLRQLKCHFTWNLMEGENSL 
DDFEDKVFYRTEFCNREFKATMCNLLAYLKHLKGQNEAALECLR 
KAEEL1 QQEHAD0AEIRSLVTWGNYAWVYYHMGRLSDVQ3 YVDK 
VKHVCEKFSSPYR : KSPELDCEEGKTRLKCGGNQNERAKVCFEK 
ALEKK PKNPEFTSGLAIAS YRLDNWPPS QNAI DPLRQAIRLN PD 
NQ YLKVLLALKLH KM R EEGEEEGEGEK \ L VEEALEKA PG \ VTD V 
LRSAA\KFYRGKDEPDKAIELliKKALEYIP\NNAYLHCQ2GCCY 
RAKVFQVM^RENGWYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCSI LASLHALADC YEDAEYYFQKEFS KELTPVAKOLLHLRYGN 
FQLYOMKCEDKAIHKFIEGVKINQKSREKEKMKDKLQKIAKMRI, 
SKNGADSEAI>HVLAFLQELNEKMQQADEDSERGLESGSLIPSAS 
SWNGE 


6001 


176 


1038 


AFAHS PSRGHRKTE 1 HTPRHTPRCTMAESHLQSSLI TASQFFE I 
WLHFDADGSGYLEGKELQNLIQELGX?ARKKAGLELSPEMKTFVD 
QYGCRI)DGKlGIVELAHVLPTEENFLLLFRCQOLKSCE\EFMKT 
WRKYDTDHSGFI ETE ELKNFLKDLLEKANKTVDDTKLAE YTDLM
LKL FDS NNDG KLE LTEMAR LL PVQEN FLLK FQG 3 KM CG KE FNKA 
FE1YDCDGNGYIDENELDALLKDLCEKNK0DLDINNITTYKKNI 
MALS DGG KLYRTD LAh I LCAG DN 


6002 


977 


81 


LAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 
SMAORSDLLELDCCLTRDRWWSHDENLCROSGLNRDVGSLDF 
EDL-PL.YKEKLEVY FS PGHFAHG SDRRMVRLEDLFQRFPRTPMS V 
E I KGKNEELJ REQ/ VLVRRYDRNEITI WAS E KSS VMKXCKAANP 
EMPLSFTISRGFWVLLSYYLGLLPFIPIPEKFFFCFLPNIINRT 
YFPrSCSCU^QLbAWSKWLIMRKSLIRHLEERGVQWFWCLNE 
ESDFEAAFSVGATCVITDYPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
AP KTSGNP AN S ARKPG S AGG P KVG AG AS KEGG AGAVDE DDF 1 KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANAZtKKIR 
SliLVAGAAOYDCFFOHLRLLDGALKLSAKDLRSQWREACITVA 
KLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVKATSGCAAIRFII 
RHTHVPRL1PLITSNCTSKSVPVRRRSFEFLDLLL0EWQTHSLE 
RKAAVLVETI KKGI HDADAEAR VEARKTYMGLRNHFPGEAETLY 
KSbEPSYOKSLQTYLKSSGSVASbPQSDRSSSSSQESLNRPFSS 
KWSTANPST VAGRVSAGSS KASSLPGSLQRS R S DIDVNAAAGAK 
AHHAAGOS VR SGRLGAGALNAG S YAS LEDTS D KLDGTAS EDGRV 
RAK LS AP LAGMGMAKADSRGRE RT KMVSQ SO PG SRSG S PGRVLT 
TTALSTVSS GVORVLVNSASAOKRSK I PRSOGCS REASPSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALY APEVYGASGPGYGI SOS SRLS SSVSAMR VLNTGSDVEEA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGSI PTYKROT \EDV\AEVLNRCAS5N WS ERKEGLLGLQN 
LLKNORTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKI)DI,0DWIvFVLLTOLLKKMGADLLGSV0AKVOKALDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKOMDPGD 
F1NSSETRLAVSRVITWTTEPKSSDVRKAA0SVLISLFELNTPE 
FTKLLGALPKTFQDGATKLLHNHLRNTGNGTOS SMGSP LTRPTP 
RS FAN WS S PLTS P TNTS QNTLS PSAFDYDTENMNS EDI YSS LRG 
VTEA10NFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSOTAL\ DNKASLLHSMPTHS SPRS RD YNPYN YSDS I S 

VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
I RALALKVLRE I LRHQPARFKN YAELTVMKTLEAHKDPHKEWR 
SAEE AAS V\ LATS I \ SPEQC I KVLCP 1 1 QTADYP INLAA1 KMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDKSESSVRKACVFCLVAV 
HAV2 GDELKPHLSQLTGSKMKLLNLYI KRAOTGSGGADPTTDVS 
GQE 


6004 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTSGNPANSAR KPGS AGGPK VGAGASKEGGAGAVDEDDF I KA 
FTDVPS1QIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFCHURIXDGAI.KLSAKDLRSOVVREACITVA 
HLSTVLGNKFDHGAEAI VPTLFNLVPNSAKWiATSGCAAI RFI I 
RHTKVPRLI PLITSNCTSKSVPVRRRS FEFLDLkLQEWOTHSIjE 
RHAAVLVETI KKG I HD ADAEARVEAR KT YMGLRNH FPG EAETliY 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KKSTANPSTVAGRVSAGSSKASS LPGSLQRSR SDI DVNAAAGAK 
AHHAAGQSVR SGRLG AG ALNAGS YAS L.EDTSD KLDGTAS EDGRV 
RAKLSAPIiAGMGMAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSOGCSREASPSRDSV 
ARSSRIPRPSVSOGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAHRVLNTGSDVEEA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGS I PTYMRQTN EDV\AEVLNRCASSNVJS£RKEGLLGLQN 
LLKNORTLSRVELKRLCEIFTRNFADPHGKRVFSMFLETLVDFI 
Q\rHKDDU3DWI>FVLI,TQLLKKMGADLI^SVOA}CVQKALDVTRES 
F PNDL.QFN I LMR FTVDQTQTPS LKVKVAI LKY I ETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE
FTMLLGALPKTFCIX^ATKL-LHKHLRNTGNGTOSSMGSPLTRPTP 
P.SPANWSSPLTSFTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 
VTEAICNFSFRSCEDMNEPLKRDSKKDDGI>SMCGGPG\MSDPRA 
GGDATDS S QTAL\ DN KAS LLK SMPTHSSPRS RDYN P YNYSDS I S 
PFNKS ALKEAMFDDDADQFPDDLS LDKS DLVAELLKELSNHKER 
VEER KI AJjYELM KLTOEES FS VWDEHFKTI LLLLLE TLGDKE PT 
1RALALKVLREILRH0PARFKNYASLTVMKTLEAHKDPHKEVVR 
SAEEAASV\LAT£1\SPEQCJKVLCPIIQTADYPINLAAIKKQT 
KVIERVSKETLNLLLPEIMPGLIOGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSOLTGSKKKLLNLYIKRAQTGSGGADPTTDVS 
GOS 


6005 


133 


5955 


rssgrrqeqlgqfpgrerkgmasglgspspcsagseeedmdall. 
nnslppphpeneedpeedlsetetpklkkkkkpkkprdpkipks 
krqkkermllc30lgdssgegpefveeeeevalrsdsegsdytp 
gkkkkkklgpkkekkskskrkeeeeedddddddskepkssaqli* 
edwgmedidhvfseedyrtltnykafsqfvrpliaaknpkiavs 
kmmmvlga kwr efs tnnpfkg ss gas vaaaaaaavawes mvta 
tevapppppvevpirkaktkegkgpnarrkpkgsprvpdakkpk 
p kkvapl-k i klgg fg s krkrs ss edddldv e sdfddas in5ysv 
s dg sts rssrs r k klrttkkkkkgeeevtavdg yetdhqd ycev 
cq0gge1ilcdtcprayhmvcldpdmekapegkwscphcekegi 
cweakednsegeeileevggdleeeddhhmefcrvckdggellc 
cdtcpssyhihclnpplpeipkgewlcprctcpalkgicvqkili 
vvkwgoppsptpvprppdappntpspkplegrperoffvkwogms 
ywhcswvselqlelhc\qvmfrnyorkndmdeppsgdfggdeek 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMlHRlliNHSVDKKG 
HV1I YLI KWRDLPYDQAS WESEDVEI ODYDLFKQS YWNHRELMRG 
EEGRPGKKLKKVKLRKLERPPETPTVDPTVKYERQPEYLDATGG 
TLHPYQMEGLNWLRFSWAOGTDTILADEKGLGKTVQTAVFLYSL 
Y KEGH S KGP FLVS AP LSTI I N \ WE R E FEMW APDM YV \ VT Y VGDK 
DSRAI I R EKEFS \ FEDNAI RGGKKAS RMKKEASVKFH VLLTS YE 
LI T I DMA I LGS I D W ACL I VDE AHR L KNNQS KF FR VLNG YS LQHK 
LLLTGrPLQNWLEELFHLLNFLTP ER FHNLEGFLEEFAD1 AKED 
OIKKLHDMLGXPHMLRRLKADVFK-NMPSKTELIVVRVELSPMVQ 

kkyyk\ y i lhs kflkaln\ argggnq vsllnwmdlkkccnhpy 
lfpvaameapkmpngmydgsalirasgkllllokmlknlkeggh 
rvli fsqmtkmldlledflehegy kyeridggitgnmrqeaidr 
fnapgaqqfcflle tragglg i nlatadtvi i ydsdwnphndi q 
afsrahr i gqnkkvm i yrfvtras ve er i tovakxkmm lthlw 
rpglgsktgsmskqelddilkfgteelfkdeatdgggdnkeged 

SSV1HYDDKAIERLLDRN0DETEDTELOGMNEYLSSFKVAQYW 
REEEMGEEEEVEREIIKQEESVDPDYWEKLLRHHYEQQQEDLAR 
NLGKGKR I RKQVNYNDGSQEDRDWQDDQSDNQSDYS VASEEGDE 
DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 
OR KAFLN AI MR YGMP PQDAF TTQW LVRDLRGKS E KE FKAY VSIiF 
MRHLCEPGADGAETFADGVPREGLSRQHVLTRIGVMSLIRKKVQ 
EFEHVNGRWSMPELAEVEENKKWSOPGSPSPKTPTPSTPGDTQP 
NTPAPVP PAEDG I KI EENSLKEEES 3 EGEKE VKSTAPBTAI ECT 
OAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 
AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMLQNGETPK 
DLNDEKQKKNIKQR FMFNI ADGG FTELHSLWQNEERAATVTKKT 
YEIWHRRHDYWLLAGIINKGYARWGDIONDPRYAILNEPFKGEM 
NRGNFLEI KNKFLAR RFKLLEQALV1 EEQLRRAAYLNMSEDPSH 
PSMALNTR FAEVECLAESHQHLS KESMAGNKPANAVLHKVLKQL 
EELLSDM KA D VTRLP AT I AR I P ? VAVRLQMS ERN I LS RLANRAP 
EPTPQQVAQQQ 


6006 


1 


965 


DNDFLRNTVHRHEPPVTAEPIRLLAENEDWWDKPSSIPVHPC 
GRFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 
ERIHEOVRDRQLEKEYVCRVEGEFPTEEVTCKEP2LWSYKVGV 
CRVDPRGKPCETVFCRLSYNGOSSWRCRPLTGRTHOIRVHLQF 
LGHPI LND P I YNS VAWGPSRGRGG Y 1 PKTNEELLRDLVAEHQAX
0SLDVLDLCEGDL»SPGl»TDSTAPSSELGKDDLEELAAAA\OKME 
EVAEAAF0E1>DTIPJLASEKAVE7DVMNQ\RQT\TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3 


225j 


KELGQVEYVFTDKTGTLTENEMOFRECSlNGMKyQElNGRLVPE 
GPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRTSPENETELIK 

ehdlffkavslchtv01iwv0tdctgdgpwqsnlaps0leyyas 
spdekalveaaarig1vfignseetmevktlgklerykllh1le 
fdsdrrrmsv1vqapsgekllfakgaessilpkciggeiektri 
kvdefalkglrtlciayrkftskeyeeidkrifeartalqqrXe 
eklaavfcfi ekdlillgatavedrlqdkvreti ealrmag1 kv 
wvltgdkhetavsvslscghfhrtmnilelimqksdsecaeqlr 
qlarr i tedhvi qhgl wdgtslslalr eheklfme vcrncs av 
lccrmaplqkakv irl 1 ki s pe kp1 tlavgdgandvsm 1 qe ah v 
gigimgkegroaarnsdyaiarfkflskllfvhghfyyikiatl 

VOYFFYKNVCFITPQFLYOFYCLFSQQTLYDSVYL.TL>Y\NICFT 
SLP ILI Y SLLEQHVDPKVLQNKPTLYRDISKNRLLSI xtflywt 
1LGFSHAF3FFFGSYLLIGKDTSLLGNGCMFGNWTFGTLVFTVM 
V I TVTV KMALE7HF WT W 1NHLVTWGS 1 1 FYFVFS LF YGG I LWPF 
LGSQWMYFVFlQLZiSSGSAWFAI ILMWTCLFLDI IKKVVDRHh 
HPTSTEKAQLTETMAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCSPTHISRSWSASDPFYTNDRSILTLSTMDSSTC 


6008 


4554 


1085 


AGVRRAGARRGPGRALFAGATAVPPPSARRRRRCPAPEHAGPAR 
ASR P SQE TMFQLP VNNLGS LR KAR KTVX K.I L»SD I GLE Y CKEH I E 
D FKQ FE PND F YLKN TT W ED VG LWDPS LTKNQDY R T KP FCCS ACP 
FSSKEFSAYKSHFRNVHSEDFENRILtNCPYCTFNADKKTLETH 
J KI FHAPNASAPSSSLSTFKDKNKNDGLKPKOADSVEQAVYYCK 
KCT YRDPLYE I VRKH I YREH FQHVAAP Y I AKAGE KS LNGAVPLG 
SNAREES S I H CKRCLFMPKS YEALVQHV I EDHER I G YQVTAMI G 
HTNVWFRSKFLMLIAPKPQDKKSMGLP PRIGS LASGNV\RS LP 
SQQMVNRLSI PKPNLN STGVNMMSS VHLQQNNYGVKS VGCGY SV 
GOSMRLGLGGNAPVS I PQQSQSVKQLLPSGNGRSYGLGSEQRSQ 
APARYS LQS ANAS SLSSGQLKS PSLS QSQ ASRVLGQS SSKPAAA 
AT G PPPGN TS STQKW K I CT I CNELFPENVYS VH FE KEHKAE KVP 
AVANYIMK I HNFTSKCLYCNRYLPTDTLLNHMLIHGLSCPYCRS 
?r NDVEKMAAHMRM VH I DEEMGPKTDSTLSFDLTLQQGSHTN2 H 
LLVrrYNLRDAPAESVAYHAQNNPPVPPKPQPKVOEKADIPVKS 
S PQAAVPY KKDVGKTLCPLCFS ILKGPI SDALAHKLRERHQV 2Q 
•P/HPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKTON 
GODKTNAP SRLNQS PSLAPVKRTYEQMEFPLLKKR KLDDDSDS P 
SFFEEKPEEPWLALD?KGH\EDDSYEARKSFLTKYFT\KOPYP 
TRREIEK1AASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
LG FNMKE LN KVKH EMDFDAEGLFENHDEKDSRVNAS KTADK KLN 
LG KEDDS S SDS FENLEE3SNES GSPFDPVFEVEPKI SNDNPEEH 
VLKVIPEDAS ESEE KLDQKEDG S KYET1 HLTEEPTKIjMHNASDS 
E VDQDDWE W KDGAS P S ES G PGS QQ VSDFEDNTCEMK PGTWSDE 
S SQSEDARSS KPAAKKKATMQGDREOUCWKNSS YGKVEG FWS KD 
CSQWKNASEKDERLSNPQIEWQNSTIDSEDGEQFDNMTDGVAEP 
MHGSLAGVKLSSQQA 


6009 


4272 


1534 


CHGLQHLTFFRELNLSLOG*EPH*AA*QAVRSEEKSIC+GSPSC 
K LVLGVLVP VARQSS HS AG PAQSAFR+ TGTGSGTPKAAEQSGYW 
EAYTLGHOHVWMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGBQASQRRTVFTAGGGECLGAKSVRASVFTGNQPGVI1GLI* 
NGXRGGCFESGYLFGFIVIGKIOSLEAKVPbP^/NGQTGERASPG 
NCRIHI VDAVC* SEHK* DHFLAAAFLENSTI IS * VAPGSWQDHA 

F VLAPQDGEG V P FVEGQLVTVLG LVVPQS I RHTFVHHTQL FLHP 
I * KLGALDVAFLHLLTLVCS S FNVAYG *GKNGGTTLHQLFAEVN 
AVTRGSAVORRPSITISSIHVDTKIQQELKDVMVAGADGWQWG 
DFF WGLAGI FHLI DDPLHQIELS FQRRV* EQCQGVKPDSQPVP 
RFLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPl^RWGLSHRT 
RDl)LRGGDRGHWVIVLCRLGSLVGGLGTDELLWFGGR*LII IG
3 * * RGR LS G E WGCGLGRGE L FQ VS I G 3 G VS I VH I GOGDHE V LGG 
AGL VERG ALHATGQG VE ALVQO LLDVG PAG ALGLCDGAAL FOG P 
GRVGOLPAEGLOVCITLVAOWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQY FLKGG * RLWCARGQ* PVKKRQRRWRG* TR 
R * NGLT 3 HCFN * L I *GAVCCRLVILR WCGLLEVHG VYGT* I HCL 
GSFPGRLWP* PFI SQERPKGHCQWE FRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL*GAWLGGAACGGOOGoPLSTWQACTGPGOAAF 
LP P FOG ACRPRTQRCRTWVCP I AV1RQLLAYTRD 


6010 


1 


3533 


IMPCGS SRLLRGCWTHPNE PVSDLS Y FDC3 ESVMENSKVLGESK 
AGI SQNAKTGDLPAFGECVG3 ASKALCGLTEAAAQAAYLVGIFD 
PKSQAGHOGLVDPIOFARANQAIQMACQNLVDPGSSPSQVLSAA 1 
TIVAKHTSALCNACRIASS KTANPVAKRHFVQSAKEVANSTANL 
VKTI KALDGD FS E DNRNKCR 3 ATAP L I EAVENLTAFASN P E F VS 
IPAQI SSEGSQAOEPI LVS AKPMLESS SYLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDSI KSLITS3 RDKAPGQRECDYSIDGINRC 
IRD I EQASLAAVSQSIiATRDDISVE ALQEOLTSWOEIGHLI DP 
I ATAARGE AAOLGHKGTOLAS Y FEPLI LAAVGVAS KI LDHOQOM 
TVLDQTKTLAES ALOMLYAAKEGGGNP KAQHTHDAI TEAAOLMK 
EAVDDI MVTIiN F.AASEVGLVGGMVDAI AEAMS KLDEGTPPE P KG 
TFVDYQTTWKYS KAIAVTAQEMMTKS VTNPBELGGLASQMTSD 
YGKLAFQGQMAAATAEPEE1 GFQIRTR VQDLGHGCI FLVOKAG\ 
ALQVC PTDS YT KR ELI ECARA VTE KVS L VLS ALOAGNKGTO ACI 
TAATAVSGI 3 ADLDTT3 MFATAGTLNAENSBTFAJDHREN3 LKTA 
KALVEDTKLLVSGAASTPDKLiAQAAOS SAATI TOIAEWKLGAA 
SLGS DDPETOWL INA1 KDVAKALSDL ISATKGAAS KPVDDPSM 
Y01> KG AAKVMVTNVTS LLKT V KAVEDEATRGTRA LE AT I EC I KO 
ELTVFQS KDVPEKTSS PEE S I RMTKG I TMATAKAVAASNS CROE 
DVI ATANLSF KAVSDMLTA CKQAS FK PDVSDEVRTRALRFGTEC 

fpipvT Tyr t rmrr ut* ntfDTBrT irrvM a t\ ire vdva r* a irrci inas 
1 JLaj X JUI^JjLiCiil V i_»V .l^jyA.F J ^r.Lils.yyjLiA>\r ol\K V/i\j/W JL £,L>± 

EAMKGTEWVDPEDPTVIAETELLGAAASIEAAAKKLEOLKPRAK 
PKQADETLDFEE03 LEAAKS I AAATSALVKSASAAQRELVAOGK 
VGSIPANAADDG0WSQGL3 SAARMVAAATSSLCEAANASVOGHA 
S EEKL3 SSAKQVAAS TAQLL VACKV KADQDS EAMRRLQAAGNA V 
KRASDKLVRAAQKAAFGKACDDDVWKTXFVGG 1 AQI I AAOEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAMRKSPGLSDCLWAW3LLLSTLTGRSYGOPSLODELKDNT 
T VFTR 3 LDRLLDG YDNRLR PGLGERVTE VKTDI FVTS FGPVS DH 
DMEYT3 DVFFRQSWKDERLKFKGPMTVLRbNNLMAS KIWTPDTF 
FHNGKKS VAHNMTM PNKLLR I TSDGTLLYTMRLTVR \ AECPMAF 

GSRXJ^OYDLbGQTVDSGIVOSSTXSEYVVMTTHFHLKRKIGYFVI 
QTYLPC1 MTV1LSQVS FWLNRESVPARTVFGVTTVLTMTTLS 3S 
ARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYA 
WDGKS WP EKPKKVKDPLI KKNNTYAPTATS YTPNLARGDPGLA 
T1AKSAT I EPKEVKPETKPPE PKKTFK SVS K1DRLSRI AFPLLF 
GI FNLVYWATYLNREPQLKAPTPHQ 


6012 


351 


5013 


PAELFOSFAIWHKELYDWRLGPWNQCOPVISKSLEKPLECIKGE 
EGIQVREIACIQKDKDIPAEDHCEYFEPKPIiLEQACLIPCOOD 
CIVSEFSAWSECSKTCGSGLOHRTRHWAPPQFGGSGCPNLTEF 
Q VCOS S P CEAEELR Y S LHVG P WSTCS M PHS RQ VROARRRG KNKE 
REKDRSKGVKDPEARELIKKKRNRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQOEKLPMTFOSCVITKECQVSEWSEWS 
PCSKTCHDM^/SPAGTRVRTRTIRQFPIGSEKECPEFEEKEPCLS 
OGDGWPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTAIjCGGG 
I OTR E VY CVO ANENLLSQLSTH KNKE AS K PMDLKLCTG P I PNTT 
OLCH 1 PCPTECEVS PWS AWG PCT YENCNDQQGKKGFKLRKRR 3 T 
NEPTGGS GVTGNCPHLL&A1 PCEEPACYDWKAVRLGDCEPDNGK 
ECGPGTQVQEWCINSDGEEVDRQLCRDAI FP 1 PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NSS ALOE VR S CNEH P CTVYH WQTG PWGQCI EDTS VS £ FNTTTTW 
NGEASCSVGMQTRKV1CVRVNVGQVGPKXCPESLRPETVRPCLL
PCKKDCIVTPY3DW?SCPS\SCKEGDSSIRKQSRHRVIIQLFAN 
GGRDCTDPLYESKACBA.POACQSYRW\KTKKW\HRCQ\LVP\WS 
VQQDSP\GAOEGCGPGROARAITCRKGDGG0AGIHECLQYAGPV 
PALTQACQ1PCQDDCQLTSWSKFSSCNGDCGAVRTRKRTLVGKS 
KKKEKCKNSHLYPL1ET0YCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGDIKECGCGYRYQAMACYDQNGRliVETSRCNSHGYI 
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 
GGRPCPKLDHVNOAQ^EWPCHSDCNQYLVT/TEPWSICKVTFV 
MMRENCGEGVQTRKVRCMONTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVI SEWG PWTQCVLPCNQS S FRQRSADP I RQPADE 
GRSCPNAVEKEPCNLKKNCYHYDYNVTDWSTCQLSEKAVCGNGI 
KTRMLDCVRSDGKSVPLKYCEALGLEKNWQMNTSCMVECPVNCQ 
LSDWS PWS ECSOTCGLTG XM I RRRTVTQP FQGDGRPCP S LMDQS 
KPCPVKPCYRWOYGOWSPCQVQEAQCGEGTRTRNISCWSDGSA 
ED FS KWDE E FCADIELII DGN KNM VLE E S CSQPC PG DCY L KDW 
SS W S LCQLT C VNGED LG FGG 1 Q VRS R P V 1 1 QE LENQK L C PEQML 
ETKSCYDGOCYEYKWr-iASAWKGSSRTVWCORSDGXNVTGGCLVM 
SQPDADRSCNPPCSQFHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTblPVVVLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFIjQ 
FFGPDGRLKTWVYGVAAGAFVLLI FI VSN1 YLACKKPKKPQRRQ 
NNRL KPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
SI VDVEFLPVYHPSPE ESRDPTLYANNVQRVMAQALGI PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRKLCGRWGR 
ARPESKDQPGRVCQAATAL 


6014 


2fl57 


613 


EAVAGGMEKSRMNLP KG PDTbCFDKDSFKKL'DFDVDHFVSUCKK 
RV01>EELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVGK 
DKALNOLSVPLGQLREEVLSLRSSVSEGIRAVDERMSKQEDIRK 
KKM CVLRL3 QVI RS VE KI EK1 LNSQSS KETS ALEASS PLLTGQI 
LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAGITAMLQOSLE 
GLLLEGLQTSDVDI I RHCLRTYATIDKTRDAEALVGQVLVKPYI 
DEVI I EQFVE SH PNG LQ VM YNKLLE FVPHH CRLLREVTGGA I S S 
EKGNT VPG YD FL VNS VW PQ I VCGLE E K LPS LFW PGNPD AFH E KY 
TISMDFVRRLEROCGSOASVKRLRAHPAYKSFNKKWNLPVYFQI 
RFREIAGSLEAALTDV1.EDAPAESPYCLLASHRTWSSLRRCWSD 
EMFLPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 
IKKPLVTGSKEPS ITCGNTEDQGSGPSETKPVVS I SRTQLiVYW 
ADLDKLQE0LPELLE1 J KPKLEMIGFKNFSSISAALEDSOSSFS 
ACVPSLSSK1 1 QDIiSDS C FGF LKS ALE VPRLYRRTNKE V PT TAS 
SYVDSALKPLFQLQSGKKDKLKQAIIQQWLEGTLSESTHKYYET 
VSDVLT3SVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 
QLALDVE YLGEO I QKLGIjQASDI KSFSAIAELVAAAKDQATAKQ 
P 


6015


13


2237


AEGCAERRCTEPWELSMSWSSGAGPGLGSC>GMDLVWSAWYGKC
WGKGSLPLSAKG1VVAWLSRAEWDOVTVYLFCDDHKLQRYALN
RI TWJR S RSGNELPLA VAS TADLI RCKLLDVTGGLGTDELRLL Y
GMALVRFVNLI SERKTKFAXVPLKCLAQEVNI PDWIVDLRKELT
HKKMPHINDCRRGCYFVLDWt-OKTYVyCROLENSLRETWEiiEEFR
EGIEEEDOEEDKNIWDD1TEOKPBPQDDGKETESDVKADGDSK
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP
KAIKAV7NNPSPRVECVLAELKGVTCENREAVLDAFLDDGFLVPT
FEOLAALQI E YEENVDLNDVLVPKPFSQFWOPLLRGLHS ON FTQ
ALLERMLSELPALGI SG I RPTYILRWTVELI VANTKTGRN ARRF
S AGQWEARRGWRLFNCS AS LDWPRMVESCLGS PCWAS PQLLR 1 1
F\KAMGQGL0DE\EOEKLLRICSIYTOSGENSLVQEGSEASPIG
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE
EKEVLPDQVEEEEENBDQEEEEEDEDDEDDEEEDRMEVGPFSTG
QESFTAENARLI^0KRGALOGSAWQVSSEDVRWDTFP\LGRMPR
SRPRTPAELMLENYDTHV1FMTKPVL\EQRLEPSTCK\TDTLGL
\SCGVGS\GNCSNSSSSNFRGAFLLEARGSLH\GL\KTGLQLF


6016


13


2237


AEGCAERRGTEPWELSMSViESGAGPGLGSOGMDLVWSAWYGKC
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYAIjN 
RITVWRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRFVNLI SERKTKFAKVPLKCIAQEVNI PDW I VDLRHELT 
HKKMFKINDCRRGCYFVLDWLQKTWCRQLENSLRETWELEEFR 
EG I E EE DOE EDKN I WDDI TEQ K PE PQDDG KS TES DV KA DG DS K 
GSEEVDSHCKKALSHK2LYERARELLVSYEEEQFTVLEXFRYLP 
KAIKAVCNNPSPRVECV1AELKGVTCENREAVLDAFLDDGFLVPT 
FEOLAAL0IEYEENVDLNDVLVPKPFSQFW0PLLRGLHSONFTQ 
ALLERMLSELPALG I SGI RPTY I LRWTVEL I VANTKTGRNARRF 
S AGOWEARRGWRLFNCS ASLDWP RMVESCLGS PCWAS P0LLR I I 
F\KAMGOGLODE\EQEKLLRICSIYTCSGENSLVOEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSBAKACQOEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEBNDDOEEEEEDEDDEDDEEEDRMEVGPPSTC 
OESPTAENARLLAGKRGALCGSAWQVSSEDVRWDTFPXLGRKPR 
S R PRT P AE L»M L ENY DTK V I FKT KPVL\EQR LE P STCK \ T DTLGL 
\SCGMGS \GNCSMSSSSNFRGAFLLEARGSLK\GL\KTGLQltF 


6017 


203 


3469


SHQEI EONSAMAPRKRGGRGISFI FCCFRNNDHPEI TYRLRNDS 
N FALQ7M E P AL PMP PV E ELDVWFS E LVDELDLTDKHREAM F ALP 
AEKKWQIYCSKKKDQEENKGATSWPEFYIDOLNSMAARKSLLAL 
E KEEEEERS KT I ESLK7ALRT KPMRFVTRF 1 DLDGLS C 1 LNFLK 
TMDY ETS ES R I HTSLI GCI KALMNNSOGRAHVLAHS ES I NV J AQ 
SLSTENI KTKVAVLE1 LGAVCLVPGGHKKVLQAMLHYQKYASER 
TR FQTLI NDLDXSTGRYRDEVSLKTA IMS Fl NAVLSQG AG VESL 
E FRLH LR YE \ FLMLG 1 H P VMDKLR KH ENSTLDR HLD F FEMLRNE 
DELEFAKRFELVHIDTKSATQKFELTRKRLTHSEAYPHFMSILH 
HCLQMPYKRSGNTVCYWLLLDRIlQOIVIONDKGODPDSrPLEN 
FN I KMWRM LVNENE VKQW KEQAE KMRKEHNELOQ KLE KKE R EC 
DAKTQEKEEMMOTLNKMKEKLEKETTEHKOVKQQVAELTAQLHE 
LSRRAVCAS I PGGPSPGAPGGPFPSS VPGSLLPPPPPPPLPGGM 
LPPPPPPLFPGGPPPPPGPPPU3AIMPPPGAPMGLALKKKSIPQ 
PTNALKSFNWSKLPENKLEGTVWTEI DDTKVFKI LDLEDLERTF 
SAY0R00DFFVNSNSKQKEADAIDDTLSSKLKVKELSV1DGRRA 
ONCNJLLSRLKLSNDEIKRAILTMDEQSDLPKDMLEQLLKFVPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAERVAEVKPKVEAIRSGSEEVFRSGALKOLLEWLAFGNYMN 
KGQRGNA YGF K I S SLNK I ADTKSS J D KN I TLLH YL I TI VEN K Y P 
S VLNLNEELRDI PQAAKVNMTELDKE I STLRSGLKAVETELEYQ 
KSQPPOPGDKFVSWSQFITVASFSFSDVEDLLAEAKDLFTXAV 
KHFGEEAG KI QPDEFFG I FDQFLQAVS EAKQENENMRKKKEEE2 
RRARMEAQLXEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLS KLKRNRKR ITNQMTDS SRERP ITKI^F 


6018 


13 


2510 


T I SOS GG I RR R REAVWFE WNMDFS R LHM YS PPQC VP ENTG YT Y 
ALSS S Y S S DALDFETEHKLD P V FDS PRMSRRSLRLATTACTLGD 
GEAVGADS GTSSAVSLKNRAARTTKQRRS T8KSAFS JXHVSRQV 
TSSGVS YGGTVSLQDAVTRRPPVLDES WI REQTTVDHFWGLDDD 
GDLKGGNKAA I OGNGDVGAGAATGHNGFFCSNCNMLSERKDVLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAGY FLLQI LRR IGAVGQAVSRTAWS ALW LAWAPGXAASG VF 
WWLG IG V? YQF VTL I S WLNVFLLTR CLR M I CKFLVLL I ?L FLLLG 
LSLRGOG\NFFSFLPVLNWASMHRTQRVDDPODVFKPTTSRLKQ 
PLQGDSEAFPWHWMSGVEQQVASLSGQCHHHGENLRELTTLLQK 
LOARVDOMEGGAAGPSASVRDAVGQPFRETDFf^AFHQEHEVRMS 
HLEDILGXLREKSEAI OKELEQTKOKTl SAVGEQLLPTVEHLQL 
ELDQLKSELSS WRHVKTGCETVDAVQERVDVOVREMVKLL FS ED 
QOGG SLEOLLORFS SQ F VS KGDLQTMLR DLQLQ ILRNVTHHVS V 
TKQLPTS E AWS AVSE AG ASG I TE AQARAI VWS ALKLY SODKTG 
M VD FALESGGG S I LSTRCS ETYETKTALMS LFGI PLW Y FS QS PR 
WIQPDI YPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHI PKTL 
S P TGNI S S A P KDFA V YGT i ENE YQEEGQLLGQFT YDQDG ES LQMF 
OALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVKGEPVK 


6019


2 


1066 


TPNDREPPPQRPPSSRRASHLAQEITSAASLGDQTQILGSLTTA 








?VI TS A I R S MPG I SSQI LTNAQGQVI GTLPWWNS AS VAAPAPA 
OS LQVQ/ -.VTP0LLLNAOG0V3 ATLAS S PLPPPVAVRK\ PSTPES 
LLKSE VO r 3 KP TPTVPQPA WI AS PAPAAKPSAS AP I P I TCS ET 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFK1RRLSLGLTQT 
0VG0AL7ATEGPAYSQSAJCRFEKLDITPKSAQKLKPVLEKWLN 
EAELRNQEGOQNLKEFVGGEPSKKRKRRTSFTPQAIEALNAYFE 
KM PLPTGOH I TEI AKELNYDRE WRVWFCNRRQTLKKTS KLNVF 
01 P 


6020 


49E; 


549 


EAIQFEVS 1 GNYGNKFDTTCKPLASTTQYSRAVFDGKYYYYLPW 
AHTKPW7L7S YW ED I SKRLDAVNTLLAMAERLQTN I EALKSG J 
OGKIPANOLAELKLKLIDEVIEDTRYTLPI.TEGKANVTVLDTQI 
RKLRSRSLSOI HEAAVRMRSEATDVKSTLAEI SDWLDKLMQLTE 
EPQNSKP 15 1 1. 1 KM I RGEKRLAYAR I PAHQ VL Y S TSG ENA SG KYC 
GKT0T3 FIX V PQEKUNGPKVPVELRVN I WLGLSAVE KKFNS FAE 
GT FTV F AE K V ENQ ALM FG KWGTSG LVG RH K F SD VTG KX KLKRE F 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGVJEVG 1 T I PPDHKPKS VIVAAEKMYHTHRRRRLV R KRKKD 
LTQTASSTAGA.MEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAP S ETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGAKTPIVSCKFDRDYIYKLRCYVYOARNLLALDKDSFSDP 
YAH I CFLKRS KTTEI IHSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
N P P K V I K E L F DN DQVG KDE FLGRS I FS P WK LNS EMD1 TPKLLW 
K P VMNGD KACG DVLVTAELI LRG KDGSN LP I LPPQRAPNLYMVP 
QGi RPVVCL7AIEILAWGLRNMKNFQMASI TSPSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFXPKEELYMPPLVIKV1DHRQ 
FGRKPWGQCTIERLDRFRCDFYAGKEDIVPQLKASLLSAPPCR 
D 1 V 1 EMELI'K PLLASKCLS SMSTALS X.MAS PATVHLTEKEEEI V 
DWWSKPYASSGEHEKCGQYIQKGYSKLKIYNCELEWAEFEGLT 
DFSDTFKLYKGKSDENEDPSWGEFKGSFR1YPLPDDPSVPAPP 
R0FREL PDS V PQ ECTVR I Y I VRGLELQ PQBNNGLCDPY I K I TLG 
KKVIE \ DRDK Y I PNTLNP VFGRMYELSCYLP0EKDLK1 SVYVYD 
TFTRDEKVGETI IDLENPF\LSRFG\SHCG\ 3 PEEYCVSGVNTW 
RDS LR \ PTC \ LLQNVARFKGFPQ P ILSEDGS R I R YGGRDYSLDB 
FEANK 1 LKCH LGAPEERLALHILRTQGLVPEHVETRTl MSTFQP 

MTC\RYVT,P\n TWtJTimVTT.nPKt;TTfiFFMQr)TYVVRiiil7 Pf^KJPP 
IX 1 O \t\ I I L)[\ v J J. 1 > IV J. I\U V J. XJLJCIvO JL 1 U&ul 'JOUl i v l\«n J. /rKDlSCE* 

NKQKTDVHY K S LDGEGNFNWRFVFPFDYLPAEQLC I VAKXEH FW 
S3DQTEFR • P?R\LII Ql W\DNDKFS \LDDYLGFPRTLTCRHTI 
HFLOKS PGGNC / RGLDMI PDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDG AS VMAG KVEMTLE I LNEKEADERPAG KGR DEPNMNP 
KLDLPl^PinsFLWFTNPCKTMKFIVWRRFKWVl 3 GLLFLLI LL 
LFVAVLLY SLPNYLSMK3 VKPNV 


6021 


4951


549 


EAIQFEVS 3 GKyGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPVV3LTSYWEDISHRLDAVNTLLAr4AERL0TNlEALKSGI 
CGKI PANOLAFLWLKLIDBVI EDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQ3KEAAVRKRSEATDVKSTLAEIEDWLDKLMQLTE 
EP0NSMPD1 1 3 WMIRGEKRLAYARIPAHOVLYSTSGENASGKYC 
GKTQT 3 FLKY PQE KNKGPKVPVELRVNI WLGLSAVEKKFNSFAE 
GTFTVFAEKYEKOAIjMFGKWGTSGLVGRKKFSDVTGKIKLKREF 
FLPPKGMEWEGEWIVDPERSLLTEADAGHTEFTDEVYONESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWSWEDDAWSYDINR 
AVDEKGWE YG 1 TI PPDHKPKS WVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGiWEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKOKHSA 
TTVFG ANT F J v S CN FDRDY I YHLR CYVY QARNL LALDKD S FSDP 
YAHI CFLHKS KTTEI I HSTLWP7WDQTI I FDEVEIYGEPQTVLQ 
NPPKVIMELFDI^DQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 
QG IRPWQLTA IE ILAWGLRNMKNFQMAS I TSPSLWECGGERV 
ES WI KNLK KT ?N FPS SVLFKKVFLPKEELYMPPLV1 KV I DHRQ 
FGRKPWGOCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 








Dl VI EMEDTKPLLASKCbSSMSTAuSKMASPATVHLTEKEEEl V 
DWWSKFYASSGEHEKCGOyiOKGYSKLKlYNCELEhrVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
ROFRELPDSVPOECTVRIYIVRGLELQPODimGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRKYELSCYLPQEKDI>KISVYDYD 
T FTRDE K VGETI I DLEKPF\ LS R FG \ SHCG \ I PE EY CVSG VN TK 
RDSLR\PTQ\LLQNVARFKGFPOPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLAIHILRTQGLVPEHVETKTLHSTFOP 
K1S\RYYLRVIIWKTKDVILDEKSITGEEMSDIYVKGVJIPGNEE 
KKOKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
£IDQTEFRIPPR\LIJQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFL0KS PGGNC/RGLDMI PDLKAMI^PLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLD L PNR PE TS FLrW FTN P CKTM K F I VWRR FKVJV 1 1 GLLFLLI LL 
LFVAVLLYSLPNYLSMKIVKPKV 


6022 


4953 


549 


EA 1 0 FE VS 1 GNY GN KFDTTCKPLAS TTQY SRAV FDGN Y Y Y Y LPW 
AKTKPWTLTS YWEDI SHRLDAVNTLLAMAERLQTN IEALKSG I 
OGKI PANOLAELWLKLI DEVIEDTR YTLPLTEGKAVVTVLDTQ1 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPOKSMPD1 1 IViMIRGEKRLAYARI PAHQVLYSTSGENASGKYC 
GKT0TI FLKYPQEKNNGPKVPVELRVNI WLGLSAVEKKFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYG3 TI P PDHKP KS WVAAE KMYHTHRR R R LVR KRXXD 
b'I'OTA S S TAG AM EELQDQEG WEYAS LIGWK FKWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEXSLEKQKHSA 
TTVFGANTP I VSCNFDRDY I YHLRCYVYQARNLLALDKDSFSDP 
YAHI CFLHRSKTTEIIKSTLNPTWDOTI I FDEVE I YGEPQTVIiQ 
NPFKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 
H PVMN GDKACGDVLVTAEL I LRGKDG SNLPI LPP QRA PN LYMV P 
QGI RP WQLTAI E I LAWGLRNMKNFQMAS ITS PS LWECGGER V 
ESWI KNLKXTPNFPS5VLFMKVFLPKEELYMPPLVI KVIDHRQ 
KGR KPWGQCTI ERLDRFR CDPYAGKEDI VPQLKAS LLSAPPCR 
DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLT2KEEEIV 
DWWSKFYASSGEHEKCGQYIOKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
ROFRELPDSVPOECTVRIYIVRGLELOPODNNGIjCDPYIKITIiG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYIIPQEKDLKISVYDYD
TFTRDEKVGETIIDLENPF\LSHFG\SHCG\IPEEYCVSGVNTW
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE
NKOKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIYAKKEHFW
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI

H FLQKS PGGNC /RG1»DMI PDLKAMNPLKAKTAS LFEQKSMKGW W 
PCYAEKDGARVKAG KVEMTLE 1 LNE KE AD ERP AG KGRDE PNMNP 
KLDLPNRPETSFLW FTNPCKTMKFI VWRR FKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 




916 


S QELGMFVEIjNNLLNTTPDRAEQGKLTLLCDAKTDGS FLVHHFL 

SFYLKANCKVCFVALIQSFSHYSIVGQKLGVSIITMARERGQLVF
LEGL/IVCSGR\VFOAQKEPKPLQFLREANAGNLKPLFEFVREA
LKPVDSGEARWTYPVLBVDDLSVLLSLGMGAVAVLDFIHYCRAT
VCWEBKGWMWLVKDSGDAEDEENDILLNGLSHQSHLILRAEGL

UTrprDnvusnT ott inJDODcriD&iruDTv^CPTVAV^TnrufQVQirp 
nl«r CtvUVrHjVLiKJL bWKKrijyrAVMKUVo C k XU X K.xyfJVo war r 

AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQI^NELELLMEKSFWEEAE 
LPAELFOKKWASFPRTVLSTGhlDNRYLVLAVNTVQNKEGNCEK 
RLVITASQSLENKELCII»RNDWCSVPVEPGDI IHLEGDCTSDTW 
1 1 DKDFG YLI LYPDMLI SGTS IAS S I RCKRRAV1>SETFRSSDPA 
TRQMLl GTVLHEVFOKAINNSFAPEKLOELAF0TI0EI RHLKEM 








YRLNLSQDEIKQEVEDYLPSFCKWAGDFMKKNTSTDFPQMQLSL 
PSDNSKDNSTCNIEWXPKDIEESIWSPRFGLKGKIDVTVGVKI 
HRGYKTKYKIMPLELKTGKESNSIEHRSOWLYTLLSQERRADP 
EAGLLLYLKTGQMYPV?ANHLDKRELLKLRJ^O«AFSLFHRISKS 
ATR0KTQLASLPQIIEEEKTCKYCSQ1GNCALYSRAVEQQMDCS 
SVPIVMLPKIEEETQHLKQTHLEYFSLWCLKLTLESQSKDNKKN 
HQNI WLMPASEMEKSGSCIGNLI RMEHVK2 VCDGQYLHNFQCKH 
GA I P VTNLMAGDR V IVSGEERSL FALSRG Y VKE I NMTTVTCLLD 
RNLS VLP E S TL FRLDQE E KNCD 3 DT PLGN LS K LMENT F VS KKLR 
DLIIDFREP0FISYLSSVLPHDAKDTVAC1LKGLNKPQR0AMKK 
VLL S KDY TL I VGMPGTG KTTT 1 CTLVR I L Y ACG FS VLLTS YTHS 
A VDN I hh KLAKF KI G FLRS R \ Q 3 0 KVHP A 1 00 FTEHE ICRS KS 3 
KS \ LALLEELYTSQL3 DATTCMG I NHPI FSRK I FDFC I VDBASQ 
ISQPICLGPLFFSRRFVLVGDKQQLPPLVLNREARALGMSESIjF 
KRLEONKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVI NLRH FKDVKLELEFYADYS DNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSN VTEAKLI VFLTS I FVKAGCSPSDI G 1 1 AP 
YRQQL KI INDIiIiARS I GMVE VNT VD KYQD\R DKSIVLVS FVRSN 
KDGTVGELLKDWRR1 .NVAI TRAKHKLI LLGCVPSLNCYPPLEKL 
LNHLNSEKLI I DLPSREHESLCH I LGDFORf 


6025 


3977 


89 


GGFPAOSDHLPPVFPLRSDLLITMSTLYVSFHPDAFPSLRALIA 
ARYGEAGEGPGWGGAHPRICUCPPPTSRTSFPPPRLPALEOGPG 
GLWVWG ATAVAQLLWPAGLGGPGGS RAAVLVQQWVS YADTELI P 
AACG ATLP ALGLR S S AQD P QA VLGALGRALS PLEE WLRLHT YLA 
GEAPTLADLAAVTALLLPFRY VLDPPARR 1 WNNVTRWFVTCVRQ 
PEFRAVI^EWLYSGARPIjSHOFGPEAPALPKTAAQLKKEAKKR 
EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKKDPGVITYDLPTP 
PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQOGFFKPEYGRPNVS 
AANPRG VFMMC I PPPNVTGS LHLGHALTNA10DSLTRWKRMRGE 
TTLVfNPGCDHAGlATQVVVEKKLWREOGLSRHQLGREAFLQEVW 
KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 
LHEEGI I YRSTRLVMWSCTLNSA3 SDI E VDKKELTGRTLJjSVPG 
YKEKVEFGVLVSFAYKVQGSDSDEE VWATTR 3 ETMLGDVAVAV 
HPKDTRY0HLKGKNVIHPFLSRSLP1VFDEFVDMDFGTGAVKIT 
PAHDQNDYEVGQRHGLEAI S I MDSRGALI NVF PPFLGLPR FEAR 
KAVLVALKERGLFRG I EDNPMWPLCNRS KDWEPLLRPOWYVR 
CGEMAQAASAAVTRGDLRI LPERHQRTWHAKMDNI RE\WCMFPG 
KLWWG \HR\I PAY FVTVSDPAV PPG EDPDGRYWVSGRNEAEARE 
KAAKE FGVS PDKI S LQQDED V LDTW FS S GLFPLS I LGWPN0S ED 
LSVFYPGTLLETGHDIIjFFWVARi'lWiLGLKljTGRLPFREVYLHA 
1 VRDAHGRKMSKSLGNVIDPLDVI YGI SLOGLHNQLLNSNLDPS 
EVEKAKEGOKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVN 
R I LGYRH FCNKLWNATKFALRGLGKGFVPS PTSQPGGHESLVDR 
W I R S RL TEA VRLSNQG FQA YD F?A VTTAQ YS FWL YELCD VYLEC 
LKPVLNGVDOVAAEC^QTLYTCLDVGLRLLSPFMPFVTEELFQ 
RLPRRKPQAPPSLCVTPYPEPSECSWKDPEAE.AALELALSITRA 
VRPVLRADYNLHPESGPTCFLEVADNEATGAIxASAVSGYVQGPG 
0 AQVWA VAE P WGL P AP \ OG CAVALASDR CS I \ HLQLQG \ LLDP 
AREU3\KLQ\AKRVEAQ\R0AQ\RLR\ERRA\ASGNPVKVPL\E 
VQEADEAKLQQTEAELRKVDEAI ALFQKML 


6026 


2674 


514 


G PI TFLKKKAKMKDMPLRIH VLLGLAITTLVQAVDKKVDCPRLC 
TCE I RPWFTPRSI YMEA3TVDCNDLGLLTFPARLPANT0I LLLQ 
TNNIAKIEYSTDFPVMLTGLDLSQNNLSSVTNINGKKMPQLLSV 
YLEENKLTELPEKCLSELSNLOELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLQMINSKWFDALPNLEILM I GEN PI 1 RI KDMNFK 
PLINLRSLVIAGINIiTE I PDNALVGLENLES ISFYDNRLI KVPH 
VALQKWNLK FLDLNKNP INR 3 R RG D FSNMLHL KELG I NNMPEL 
I S I DS LAVDNLPDLRK I EATNNPRLS Y IHPNAFFRLPKLESLML 
NSNALSALYHGTIESLPNLKE3 S I HSNP IRCDCVI RWMNMNKTN 
IRFWEPDSLFCVDPPEFQGQNVROVHFRDMKEICLPLIAPESFP 
SNLNVEAGS YVSFHCRATA\EPO?E I YW I TPSGQKLIiPNT\LTD 








KPYVHSEGTLDJ KGVTPKEGGLYTC3 ATNLVGADLXSVMI KVDG 
SFPQDNNGSLK j K3 RDIQANS\TLVSWKASSKILKSSVKWTAFVK 
TENSKAAQSAK 3 PSDVKVYNLTKLNPSTEYKICi DI PTI YQXNR 
KKCVNVTTKGLH PD0KEYEKKNTTTLMACLGGLLG1 IGV2 CLI S 
CLSPEMNCDGGHSyVKNYLOKPTFALGELYPPLINLWEAGKEKS 
TS LKVKATVI GL F TNMS 


6027 


5254 


4346 


GGRRAPGRPGRS ] KDEEE ET V FREWS FS PDPLP VR YYDKDTTK 
PISFYLSSLEEI.IiAWKPRLEDGFNVALEPLACRQPPLSSQRPRT 
LLCHEMMGGYLSDKFIQGSVVOTPYAFYHWQCIDVFVYFSHHTV 
T I PP VG WTNTAH R H G VCVLGT F I TEW N EGGRLCEA FLAGD ER S Y 
QAVADRLVQITXRFFRFDGWLINIENSLSLAAVGNMPPFLRYLT 
TQLHROVPGGLVLV.'YDSWQSGQLKWQDELNQHKRVFFDSCDGF 
FTNYNWREEHLE R MLGQAG ERRADVYVGVDVFARGNWGGRFDT 
DKVGGGFRPRASGPVPPIjGPHFLMDLPFPSAPQRNDSSCSSQSG 
DP VALRNRCPAR AKLCPH 


6028 


120 


3432 


NCLLLQAKGFHGE 2 E D LQG/KLT DTERK LLAS KPLGG L P ETA KEQ 
LNVHMEVCAAFEAKEETYKSLMOKGQGMIARCPKSAETWIDQDI 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFKNSL\QDFIN 
WLTQAEQTLNVASRFSLI LDTVLFQI DEHKVFANEVNSHR EQ I 1 
ELDXTGTHLKY FSQKQDWL3 KNLLISVQSRWEKVVQRLVERGR 
SLDDARKRAKQFHEAWSKLMEWLEESEKSLDSELEIANDPDK1K 
TQLAQHKE FQKS L£ AKHS WDTTNRTGRSLKEKTSLADDNLKLD 
DMLS 2 LRDKWDT 1 CG K S VERQN KLE EA \ LLFSGQFTDALQALl D 
WLYRVEPQLAEDOP VHGDI DLVMNL I DNHKAFQKELGKRTS SVQ 
ALKRS ARELI EG SRDDSS WVKVQMQELSTRWETVCALS 1 S KQTR 
LEAALRQAEErrfSWHALLEWLAEASC/TLRFHGVLFDDEDALRT 
LI DOHKEFMKKLEEKRAELNKATTMGDTVLAICHPDS 1 TTI KHW 
ITIIRARFEEVXiAWAKQHQORLASALAGLIAKQELLEALLAWLO 
WAETTLTDKDKEV J PQEI EEVKALI AEHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPF SLQSHI PVLDKGRAGRKRFPASSLYPSGSOT 
QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRa^ K HKKSRVMDFFRRIDKDQDGK1TRQEFID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
YKPI7DADKIEDEVTR0VAKCKCAKRFQVEQ1GDNKYRFFLGNQ 
FGDSOOLRLVR1 LR S T VM VRVGGGV7MALDEFLVKNDPCRAKGRT 
NKELREKFILADGASQGMAAFRFRGRRSRPSSRGASPNRSTSVS 
SCAAQAA5PQVPATTTPK2 LHPLTRNYGKPWLTNSKMSTPCKAA 
ECSDFPVPSAEGTP2QGSKLRLPGYLSGKGFHSGEDSGLXTTAA 

ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE
IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK

LDKSSKR 


6029 


1 


3533 


1MPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM
AGISONAKTGDLPAFGECVGIASKALCGLTEAAAQAAYLVGIFD
PNSQAGHQGLVDPIQFARANOAIQKACQNLVDPGSSPSQVLSAA
TIVAKHTSALCNACRIASSKTANPVAKRHFVQSAKEVANSTANL
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENLTAFASNPEFVS
IPAQISSEGSQACEPILVSAKPMLESSSYLIRTARSLAINPKDP
PTWSVLAGHSHTVSDSIKSLITSIRDKAPGQRECDYSIDG3NRC
IRDIEQASLAAVSOSLATRDDISVEALQEQLTSWQEIGHLIDP

2ATAARGEAAQLGKKGTQLASYFEPLILAAVGVASKILDHQQQM
TVLDQTKTLAESALOMLYAAKEGGGNPKAQHTHDAITEAAQLMK
EAVDDIMVTLNEAASEVGLVGGHVDAIAEAMSKLDEGTPPEPKG
TFVDYQTTVVKYSKA1AVTA0EMMTKSVTNPEELGGLASQMTSI?
YGHLAFQGQMAAATAEPEEJGFOIRTRVQDLGHGC3FLVQKAG\
ALQVCPTDSYTKRELIECARAVTEKVSLVLSALQAGNKGTQACI

TAATAVSGI IADLDTTI MFATAGTLNAENSETFADHRENILKTA 
KALVEDT KLL VSGAA S TPDK LAQAAQSS AATI TQLAEVVKLGAA 
SLGSDDPETQWL 1 KA I KDVAKALS DLI S ATKGAAS KPVDDPSM 
YQLKGAAKVMVTNVTS LLKTVKAVEDEATRGTRALEATI BCI KQ 
ELTVFOSKDVPEKTSS PEES 3RMTXGITMATAKAVAAGNSCR0E 
DVIATANLS RKAVSDM LTACKQAS FHPDVSDEVRTRALRFGTEC 



WO 01/53312 



PCT/US00/34263











TLGYLDLLEHVLV1L0KPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPEDPT V I AETE LLGAAAS 1 EAAAKKLEOLKPRAK 
PXQADETLDFEEQ3LEAAKSIAAATSALVKSASAAQRELVAQGK 
VGS I PAN AADDGQWS OG L I S AAR M VAAATS S LCE AANAS VQGHA 
S EEKL1S S AKQVAASTACLLVAC KVKADODSEAMRRLQAAGNAV 
KRASDNLVRAAQK AA FG XADDDDVW KTK FVGG I AQ 3 1 AAQ E EM 
LKKiRELEEARKKLAQIRQQQYKFLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLEVL3 CLGLMGLERALNVLAPI FYRN I VNLLTEN 
APWNS LAW ? VTS YV FLKFLQGGG TGS TG F VSN LRTFLW I RVQQF 
TSRRVELLIFSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
S YLVFNV3 PTLADI 1 3 GI I YFSMFFNAWFGL3 VFLCMSLYLTLT 
IWTEWRTKFRRAMKTOENATRARAVDSLLNFETVKYYNAESYE 
VERYREA1 1 KYOGLEWKSSASLVLLNQTONLVIGLGLLAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYMPL^WFGTYYRMIQTNFID 
MENMFDLLKK\ETEVKDLPGAGPFR?0KGR1EFENVHFSYADGR 
BTLQDVSFTVMPGQTLALVGPSGAGKST3LRLLFRFYD3SSGCI 
RIPGODISQVTOALFRFSKWELCPKDTVLFNDTIADNIRYGRVT 
AGND E VE AAAQAAG 1 HDA 1 MAF P EG Y RTO VG ERG LKLSGGEKQR 
VAIARTILKAPGIILLDEATSALDTSNERA3QASLAKVCANRTT 
IWAHRLSTWNADOILVIKDGCIVERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMSENLDKSNVNEAGKS KSNDSEEGLEDAVEGADEAbQKAI KS 
DSS S PQRVQRPHSS PPRFVTVEELLETARGVTNMALAKE I WNG 
DFQIKPVEliPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
IKLVGEIKETLLSFLLPGHTRLRNQITEVLDLDLIKOEAENGAL 
DISKIiAEFIIGMMGTbCAPARDEEVKKLKDIKEIVPLFREIFSV 
LDLMKVDMANFAI S S I RFHLMQQS VEYERKKFQE I LERQPNS LD 
FVTQWLEEASEDLMTQKY KHALPVGGMAAGSGDMPRLS P VAVQN 
YAYLKLbKKDHLORPFPETVLMEOSRFHELQLQ\REOLT3LGAV 
LLVTFSMAAPGI S SQAD FAEKLKM3 VK I LLTDMHLPS FHLKD VL 
TTIGEKVCLEVSSCLSLCGSSPFTTDKETVLKGQIQAVASPDDP 
IRR3MESR3I>TFbETYLASGHQKPLPTVPGGI»SPVQRELEEVAI 
KFARL VNYNKM V FC P Y YDA I LS KI L VRS 


6032 


39 


2415 


AARLCRAQPTKSAWM3RDLSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRIKEEFQFLOAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLN 1 EMH KQAE I VKRLNA 3 CAQV 3 P FLSQEHQQQ WQAVERAK 
QVTMAELNAIIGQQQLCAOHLSHGHGLPVPLTPHPSGliQPPAIP 
'P3GSSAGLLALSSALGGOSHLPIKDEKKHHDNDHQRDRDS3KSS 
SVS P S AS FRG AEKH RNSAD Y S S ESKKQKTEEKE 3 AAR YDSDGEK 
SDDNL WD VSNEDPSSPRG S PAHSPR ENG bDKTRLLKKDAP ISP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPG VDFLAS SLRTPMAVPCPY PTP FGI VPHAG 
MNGELTSPGAAYAGLHN 3 SPQMSAAAAAAAAAAAYGRS PWGFD 
PHHKMRVPAI PPNLTG 3 PGGKPAYSFHVSADGQMQPV P FPPDAL 
I GPG I PRHARQ 3 NTLNHG E WCAVTI SNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL 
S 3 WDLAAPTPR I KAELTE£APACYAIiA3 S PDSKVCFS CCSDGNI 
AVWDLHNQTLVRQFQGHTDGASCI DI SNDGTKLWTGGLDNTVRS 
W\DLREGRQLQQHD/FFTSFVFSLGYCP\TSEWLAVGMENSN\V 
EVLSVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTG3CDNLLNA 
W\RTPYG\ASIF\0SKESSS\VLSCD3\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWM I RDLS KMYPQTRHPAPHQPAQPFKFTI SE 
S CDR I KEE FQFLQAQ YHS LKLECEKLAS E KTEMQRH Y VM Y YEMS 
YGLNIEMHK0AEIVKRLNAICAQVIPFLS0EK00QWOAVERAK 
QVTMAELNAI IGOOQbOAOHLSHGHGLPVPLTPHPSGLOPPAI P 
P IGSSAGLLALSSAbGGOS HLP3 KDEKKHHDNDKQRDR DS IKS S 
S VS PSAS FRGAEKKRN S AD Y S S E S KKQKTEE ICS I AAR Y DS DGE K 
SDDNLWDVSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAPISP 
AS I ASSSSTPSSKS KELSL-NEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 








MNGELTSPGAAYAGLHNISPQMSAAAAAAAAAAAYGRSPVVGFD 
P HHI 1MR V P Al P P N LTG I PGGKP AY S FHVS AD3QMQ P V P FP P DAL 
I GPG1 PRHAR01KTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL 
S I WELAAPTPR I KAELTSSAPACYALAISPDSKVCFSCCSDGNI 
A VWDLHNQTLVRQFQGHTDGASCI DI SNDGTKLWTGGLDNTVR S 
W\DLREGRQLQOHD/FFTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVLHVTKPDKYOLHLHESCVLSLKFAHCGKWF\VSTGKDNLUNA 
K\RTPYG\ASIF\QSKHSSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


E S GRR RR LKRRRS PC PGTAGG PGETN PG PGACPRG PREEAAAAM 
E I APOEAP P VPGADGD IEEAP AEAGSP S PASPPADGRLKAAAKR 
VTF PSDED I VSGAVE PKDPWRHAQNVTVDE VI GAYKQACQKLNC 
R0IPKLLRQLQEFTDU5HRLDCLDLKGEKLDYKTCEALEEVFKR 
L0FKWDLE0TNLDEDGASALFDMIEYYESATHLNISFNKHIG7 
RGWQAAAHMMRKTSCLQYL\DARNTPLLDHSAPFVARALRIRSS 
LAVLHLENASLSGRPbMLLATALKMNMNLRELYL\ADNKLNGLO 
DSAOLGNLLKFNCSLQILDLRNKHVLDSGXiAYICEGLKEQRKGL 

RHLKNGLlSNRSVLRbGLASTKLTCEGAVAVAEFIAESPRXLRL 
DLRENEI KTGGLMALSLALKVNHSLLRLDLDREPKKEAVKSFI E 

QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPR5GPPLPNGLKPEFALALFPEP 
P PG P EVKGGS CGLEHELS CS KNE KEL EELL1>E AS QESGQETL 


6035 


IS 


404 


SVTYLGI I LH KNTGAL PAD P VQL I SQTPT P S TKQQLLS FLGM VG 
YFYLWIPGFAILTKPLCKXiTKENliADAIDPKSFSHSSFRSLKTA 
LENASTLALPDSSQPF\SLHTAEVQGCWE1LT0GLGPLPV 


6036 


1745 


356 


LPDVEKLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGOGRGVEKPPHLAAblLARGGSKGIPLKNIKHIiAGVPLIGW 
VLRAALDSGAFQSVWVSTDHDEIENVAKQFGAOVHRRSSEVSKD 
SSTSLDAI I EFLNYHNEVDI VGNIOATSPCLHPTDLQKVAEMIR 

WDGELYENGSFYFAKRHLIEMGYLOGGKMAYYEMRAEHSVDIDV 
E 3 D W P I AEQ R VLR YGY FGKEKLKE I KLIjVCN I DG CLTNGH I YVS 
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLAWDEWRKEMGLCWKEVAYLGNEVSDEECLK 
RVGLS GAPADACSTAQKAVGYICKCNGGRGA\ I REFAEHI C\ LL 
MEKGLINFNPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTS W WMSSVLTI LLFS1/QGNKMLNYS APSAGGYLLPRKPVGTPA 
GGGFPRRHSVTLPSSKFRQKOLLSSLKGEPAPALSSRDSRFRDR 
S FS EGGER LLPTQKQPGGG Q VNS SR YKT\ ELCRP FEENG ACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFCPYGPRCHFI 
Ht^EERRALAGARDLSADRPRLQHS FS FAGFFS AAATAAATGLL 
DSPTS I TP PPI LS ADDUiGSPTLPDGTNNPF\AFS SQELASLFA 
PSMGLPGGGSPTTFLFRPMSES PHMFDS PPSPQDSLSDQEG YLS 
SSSSSHSGSDS PTLDNSRRLP I FSRLS I SDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHR RKQI I S CNICQLR FNSDSQAAAH 
YKGTKHAKKLKALEAMKNKQKSVTAKDSAKTTFTSITTNTINTS 
SPKTBGTAGTPA1STTTTVEJRKSSVMTTEITSKVEKSPTTATG 
NSSCPSTETEEBKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGT3KAFPRAGVKGKGPVNKGNTGL0NKTFHCEICD 
VH\^ T SET0LK0H1SSRRHKDRAAGKPPKPKYSPYNKL0KTAHPL. 
GVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSSPFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073 


1000 


LDEYEARLTLAKXDDFEEDNEDDDENRVNQEEKAAKITELINKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKTEDSFYNNSYNPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST 
KXKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VNPVOELETERRVKRKAPAPPVLSPKTGVLNENTVSAGKDLSTS 








PKPSPIPSPVLGRKPNASQSLiiVWCKEVTKNyRGVKI'J'NFTTSW 
FJ3GLSFCA3LKKFRPDLIDYKSLNPQD3KENNKKAYDGFAS3GI 
SRLhEPSVMVllJKlPDKhTym'YhYQlRAHFSGQthirr^QIEZN 
SSKSTY KVGNY ETDTNS SVDQEK FYAELSDLXREPE LQQ PISGA 
VDFLSQDDSVFVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
SDTEPQKSQQSSGRTSGSDDPGICSNTDSTOAQVLLGKKRLLKA 
ETLE^SDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENSRSLECRSDPESPIKKTSLSPTSKLGYSYSRDLDLAXKKHA 
SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
LEQARRDAALKAGNKHNTNTAAPFCNRCLSDQODEERRROLRER 
AROLIAEARSGGKMSELPSYGERAAEKLXERSKASGDENDNIEI 
DTNE E I PEG FWGGGDELTNLENDLDTPEQNS KL VDLKL KKLLE 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RK P WFS KDS TVR KTQLQS FSQ Y 3 ENRPEM KRQRS 3 QEDTKKGN 
EE KAAI TETQ RK ?S EDEVLNKG FKDS \SQYWGELAALENEQKQ 
IDTRAALVEKRLRYLMDTGRNTEEEEAJ^IQEWFMLVKKKNALIR 
RKNQLSLLEKEHDLERRYELLNRELRAMLAIEDWQKTEAQKRRE 
OLLLDELVALVW KRDAIiVK DI .DAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLC 


6040 


475 


1052 


PTALMTA PS CAF P VQFRQPS VSGLSQ I TKSLY 3 SNG VAANNKLM 
LS SNQI TMV3 MVS VEWNTLYED I QYMQVP VADS PNSRLCDFFD 
P3 ADHIHS VEMKQGR \ TLLHCAAGVSRS AALCLAYLMKYHAMSL 
LDAKTWTKSCR P 1 I RPNSGFWEQL 3HYEFQLFGKNTVHM VSS PV 
GMIPDIYEKEVRLM3PL 


6041 


2 


3886 


TEKX)EKTAHNLENVLIHFWERLSE3CVAKISEPEADVESVLGVS 
NLLCVLOKPKGSLKSSKKKNGKVRFADEILESNKENEKCVSSEG 
EK3ECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLAD1SINY 
VN ER KS E QHLR F L S TLLDS FS S S R VFKMLLGCE KQS I VQA KPLE 
3AKLVQKNPAV0FLYQKLIGWLNED0RKDFGFLVPILYSALRCC 
DNDMERKKVLDDLTKVDLKWNS LLK1 1 EKACPS SDKHALVTPWL 
KGD I LG E KL VIsDoADCIjCNEDLE S R VS S ES H FSER WTL.J. »S L VLS Q 
HVKNDYLIGDVYVERIIVRLHETLFKTKKLSEAESSDSSVSFIC 
DV A YN Y F S S AKGCLLM P SS EDLLLTLFQLCAQS KE KTHLP D FL I 
CKLKNTWl.SGVNLLVHQTDSSYKESTFLHLSALKLKNOVOASSL 
D I NSLQVLLSAVDDLLNTLLES E DSYLMGVY 3 GS VM PNDSE WEK 
MR0SLPMQWLKRFLLEGRLSLNYECFKTDFKE0D1KTLPSKLCT 
SALLSKMVLIALR KETVLENNELEKI IAELLYSLQWCEELDNPP 
3FLIGFCE3LQKKN3TYDNLRVLGNMSGLLQLLFNRSREHGTLW 
SLI I AKLlLSRSi S S DEVKPHY KRKESF FP LTEGNLHT 3 QSLCP 
FbSKEEKKEFSAOCI PALLGWTKKDLCSTNGGFGHIAI FNSCLQ 
TKS IDDGELLHG3 LKI 3 ISWKKEHEDIFLFSCNLSEASPEVLGV 
m EI IRFLSLFLKYCSSPLAESEWDFIMCSMLAWLETTSENQAL 
YS I PLVQLFACVS CDlACDLSAFFDSTTIiDTIGNLPVWL I SEWK 
EFFSQG IKS LLLP 3 LV7VTGENKDVSETS FQNAMLKPMCETLT Y 
3 SKEQLLSHKiPARLVAI^KTl^PEYliOTLl^TIAPLLLFRARP 
VQ3 AVYHiViLYKLMPELPQYDQDN^KS YGDE EEEPALS PPAALMS 
LLS I QEDLLENVLGC3 P VGQI VT3 KPLSEDFCYVLGYLLTWKLI 
LTFFKAASSOLRALYSMYLRKTKSI^KLLYHLFRLMPENPTYAE 
TA\ f EV?NKDPKTFFTEELQLS3RETTMLPYHIPHIiACSVYHMTL 
KDLPAMVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQE3SSVQT 
STQLFKGMTVKARATTREVMATYTIEDI VI ELI 3QLPSNY PLGS 
I I VESGKRVGVAVQQWRNWMLQLSTYLTHQNGS I MEGLALWKNN 
VDKRFEGVEDCM3CFSV3 HGFNYS LP KKACRTCKKKFKS A \ CLY 
KW FTS SNKS TCS LCRETFF 


6042 


1306 


253 


MAELAPASPSDIKASVSNGDTTLLCSRRQSCGMNEVRQVSLTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAOSRGY 
GAOPJ^PGGLSYP/iASPTPHAAFLADPVSNMAMAYGSSLAAQGKB 
LVDKN IDRFI P 3 TKLKYYFAVDTKYVGRKLGLLFFP YLHODWEV 
QYCODTPVAPRFDVNAPDLYIPAMAFITYVLVAGLAI^TQDRFS 
PDLLGLQASSAIiAWLTLEVLAI LLSLYLVTVNTDLTT3 DLVAFL 
GYKYVGMIGGVLKGLLFGKIGYYLVLGWCCVAIFVFM3RTLRLK 








IliADAAAEGVPVRGARNQLRMYLTKAVAAAOPMLMYWLTFHLVR 


6043 


40j 


599 


LCLFFPFPCATPVLPLPSL1SAL/CLSHLSVSSWFCPC0PPLPC 
PLPPLGNKTAKGSLSTEQSERG 


6044 


793 


412 


KLEMWNFTL I S KVKI S RE VTMI AS KFG IGQQVRHSLLG Y LGVW 
DIDP VYS LS E PSPDELAVNDELRAAPWYHWMEDDNGLP VHTYL 
AEAOLSSELQDEHP\EQPSMDELAQTIRKQIiQAPRLRN 


6045 


155 


2299 


SPliPQVAAMNYLRRRLSDSKFMANLPKGYMTDLORPQPPPPPPG 
AH S PG AT PG PG TATAE RS S G VAPAAS PAAPS PGSSGGGG F FS S L 
S N A VKQTTAAAAATFS EQ VGGG SGG AG RGGAASR VLLVI DE PHT 
D WA K YFKG K K I HGE I DI K VEQAE FS DLNLVAHANGGFS VDME Vh 
RNGVKWRSLKPDFVLIRQHAFSI4ARNGDYRS1jVIGL0YAGIPS 
VNSLHSVYNFCDKPWVFAQHVRLHKKbGTEEFPLIDOTFYPNKK 
EMi»S S \TTY PWVKMGHGTLWGWGK-\mVDNOHDFCDl AS WALT 
KT Y ATAE P F I D AK YDVR VQK1 GON Y KAYMRTS VSGNW KTN TGS A 
MLEOIAMSDRYKLWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WG S S MPL I GDHQD ED KQL. I VELWMKMAQ ALPRQRQR DAS PGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRQGPPLQORPPPOGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
ASOAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQROAGPPQ 
ATR0TS VSG P AP PKAS G APPGGQQRQGP PQ KP PG PAG P T RQ ASQ 
AGPVPRTGPP TTQQPR PS G PGP AGR ? KPQLAQKP SQDV P P PATA 
AAGGPPHPOLNKSQSLTNAFNIiPEPAPPRPSLSODEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SLIMDS PRAG THQGP LDAETEVGADR CTSTAYQEQR PQ V EQVGK 
CAPLS PGLPAMGGPGPGP CEDPAG AGGAGAGGSE PLVTVTVQCA 
FTVALRARRGADLSSLRALLGQALPHOXAOLGOLSYLAPGEDGH 
WVP 1 PEEESLQRAWQDAAACPRGLOLQCRGAGGRPVLYOWAQH 
SYSAOGPEDLGFRQGDTVDVLCEVDCAWLEGHCDGRIG I FPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSLRMREADTIAPPOLMEVSADIISTVEFNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
I EEKI NKI KWLPQQNAAHS LLSTNDKTI KLWK I TERDKR PEG YN 
LKDE EGKLKDLSTVTSLQVP VLKPMDLMVEVS PRRI FAKGHTYH 
INS3SVNSDCETYMSADD1jRINLWHLAITDRSFTP\NIVDIKPA 
NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYb 
TVKVWDIANME ARPIETYQVHDYLR S KLCSLYENDCI FDKFECA
WNGSDSV I MTGA\ YNNFFRMFDRKTKRDVTLNEASRESSKPRAV 
LKPRRVCVGGXRRRDDISTOSLDFTKKILHTAWHPAENIIAIAA 
TNNLYIFQDKVNSDMH 


6048 


1 


3194


G1RTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTS N S S KTRAGANS KGRRGSQNS S EHRP PASSTS EDVKAS PS S 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
EPT VLDRKCPS PVLI DCPH PNCNKKYKHINGLKYHQAHAHTDDD 
SKPEADGDSEYGEEPlLHADLGSCNGXASVSQKXGSLSPARSAr 
PKVRLVnEPHSPSPSSKFSTKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 
SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 
LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLL.NGSSDPHQSR 
LASIKAEADKIYSFTDNAPSPSIGGSSRLENTTPTQPLTPLHW 
T0NGAEASSVKTNSPAYSD3SDAGEDGEGKVDSVKSKDAEQLVK 
EGAKKTLFPPQPQSKDSPYYOGFESYYSPSYAOSSPGALNPSSQ 
AGVESOALKTKRDEEPESI EGKVKND1 CEEKKPELSSSSCXJPSV 
IQORPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 
YEEQQKRQS LEQQQRG VDKKAEMG LKEREAALKEEW KQ KPS I P P 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI I PKLDDS SKLPGQ 
AP EGLKVKLSDASHLSKEAS EAKTG AECGRQAEMDP I LWYRQEA 
EPRMWTYVYPAKYSDIKSEDERWKEERDRKL.KEERSRSKDSVPK 











EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY 
IPYMKGySYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 
YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 
S PTI SDKTSQERDRGGCGVVGGGGSCS S VGGASGGERSVDRPRT 
SPSQRLMSTHHHHHHLGYSLLPAQYNLPYAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTLPESSAT 
DS3YYSP7GGAPKGYCSPTSASYG\KALNPYQYQYHGVNGSAGS 
YPAKAYADYSYASSYHQYGGAYNRVPSA7NQPEKEVTEPEVRMV 
NGKPKKVRKPRTIYSSFQLAALQRRFQKTQYLALPERAELAASL 
GLTQTOVKIMFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
OSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6050 


566 


1718 


KGLSRTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRSI I FANYIARDTRRLGATI LD 
RIHSLQINSS LST YS LVDS VGNTKTFD V EHSHVR FLGNLVLNLW 
DCGGQDTFMENYFTSQRDNI FRNVEVL1 Y VFDVESRELEKDMHY 
Y0SCLEA3 LQNSPDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECS CFRTS I WDETLYKAWSSIVYQLI PNVQQLEMNLRN 
FAEIIEADEVLLFERATFLVISHYQCKEQRDAHRFEKISNIIKQ 
FKLS CS KLAAS FQSME VRNSNFAAFI D I FTSNTYVMWMSDPS I 
PSAATLINIRNARKKFEKLERVDGPKQCLLMR 


6051 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLG?RMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRS 1 1 FANYI ARDTRRLGATILD 
KIHSLQINSSLSTYSLVDSVGNTKTFOVEHSHVRFLGNLVLNLW 
DCGGODTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CLEAI LQNS PDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
R LS R PLECS C FRTS I WDETL YXAWS S I VY QL I PNVQQLEMNLRN 
FAE3 IEADEVLLFERATFLVISHYQCXEQRDAHRFEKISNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAF I DI FTSNTYVMWMSDPS I 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6052


566 


1718 


KGLERTCCAMEESDSEKTTEKEKLGPRMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRS 1 1 FANYI ARDTRRLGATILD 
R 3 H S LQ I NS S LST YSL VDS VGNTKT FD VEHS HVR FLGNLVLN LW 
DCGGQDTFMEN YFTSQRDN I FRNVEVLI YVFDVESRELEKDMHY 
YQSCLEAI LQNS PDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTS I WDETLYKAWSS I V YQLI PNVQQLEMNLRN 
FAEI IEADEVLLFERATFLVISHYQCKEORDAHRFEKISNI IKQ 
FKLSCSKLAAS FQSMEVRNSNFAAFI DI FTSNTYVMWMSDPS I 
PSAATLI NI RNAR KHFEKLER VEGPKQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPA 
HDSGKGDDESPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLGFLNVTN YCHLAKELRLSCMER KKVQIRSMDP S ALASD 
RFNLI LADTNSDRLFTVNDVTVGGSKYG 1 1 NLQS LKTPTLKVFM 
HEWLYFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRR VLLTNWTGHRQS FGTNSDVLAQQ FALMAPLLFNGCRS 
GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKI KLWDLRTT KC VRQY EGH VNEYA Y L ? LHVHE EEG 1 LVAVG 
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 
APGLLMAVGQDLYCYSYS 


6054 


1 


1054 


PPIARLQEFGTSRRKMAAPSGVHLLVRRGSHRIFSSPLNHIYLH 
KOSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 
V n hi VFL/WtaUbi h. VKNbJJQI QGLHQACQLAkHVLLLAGKSLKV 
DMTTEE I D ALVHRE 1 1 SHNAYPS PLG YGG FPKS VCTS VNNVLCH 
GI PDSRPLQDGDI INI D VTVYYNG Y HGDTS ETFL VGNVDECG KK 
LVEVARRCRDEAIAACRAGAPFSVIGNTISHITHQNGFQVCPHF 
VGHGIGSYFHGHPEIWHHANDSDLPMEEGMAFTIEPIITEGSPE 
FKVLEDAWTWS LD/TSKVS AQFEHTVLI TS RGAQI LTKLPHEA 


6055


423 


2364 


PPY FLLSFLAWWLYGQS DRTETD I SQSAGPPPGTJjQCSALHKD P 
GCANCSRFCRDCSPPACOCHTHVFPGNALKGVQPPELSRTLAIjI 
SSREPPRKKKKSQTETGKERERTSFLTQGGKRFELQHGLAGICM 
TLblTGDS I VS AEAVWDHVTKANRELAFKAGDVIK VLDASNKDW 
K WGQ IDDEEG W FPAS F VRLWVN H E DE VEEG PS D VQNGHLDPNS D 
CLCLGRPLQNRDQMRANVINEI MSTERKY 1 KHLKDICEGYLKQC 
KKRR DMFS DEQLKVI FGNI EDI YR FOMGFVRDLEKQYNNDDPKL 
£ E I GPCFLEHQDGFW1 YSEY CNNHLDACMELSKLMKDSRYOHFF 

SDYRYVAAA1AVMRNVTQQI NER KRRliENl DK IAQWQAS VLDWE 
G E D ~ I>DR S S ELI YTGEMAW I YQ P \ YGRNQORV F FLFDHQM VLCK 
KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
N0KRQAAMTVRKVPKOKGVNSARSVPPSYPPPODPLKHG0YLVP 
\DG1AQSQVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
bAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLAYMESKGAHRAGLAKVIPPKEWKPROCYDDIDNLLIPAPIOQ 
MV TGQSGLFTQYNI QKKAMTVKEFRQLANSGKYCTPRYLDY KD1» 
ERKY WKNLTFVAP1 YGADI NGS1 YDEGVDEWNIARLN7VLDWE 
EECGISI EGVNTP Y LY PGMWKTTFAWHTEDMDLYS INYLHFGEP 
KSWYAI ?PEHGKRLERLAQGFFP£SS0GCDAFLRHKMTL1 SPSV 
LKKYG1 PFDK1TQEAGE FMI TFP YG YHAGFNHGFNCAESTNFAT 
VR W I DYGKVAKLCTCR KDMVK I SMD I FVRKFQPDR YQLWKOG KD 
I YT1 DHTKPTPASTPEVKAWLORRRKVRKASRS FOCARSTS KR P 
KADEEEE VSDE VDGAE VPNPDS VTDDLKVS EKS EAAVKLRN TEA 
SSEEESSASRMQVEQNLSDHIKLSGNSCLSTSVTEDIKTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEXSDPSELSWPKSPESCS 
S V AE SNG VLTEGEES DVES HGN GLE PGEI PAVPSGERNSFKV PS 
I AEGENKTSKSWRHPbSRPPARS PMTLVKQOAPSDEELPEVLS 1 
EEEVEETESWAKPl^lKLWOTKPPNFAAEQEYNATVARMKPHCAI 
CTLLMPYHKPDSSNEENDARWETKLDEWTSEGKTKPLIPEMCF 
I YSEEKI EYSPPNAFLE EDGTS LLI SCAKCCVRVHASCYG I PSH 
E 3 CDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAm^MCA 

UlslfmwrD C w PM\/r>CT5'P/^>T T~\\Tf > 'D T DTYM5T.VT V C T tVP WD\7TTB V Q fllv 

V/WFUVKr lTJVFfc»K.iyi JJ»v»Kl rJjvKlJlvljRCJ.rCI\njXVi\i\ vow/* 
C J QC S YG RCPAS FHVTCAKAAG VIA M E PDDWPY WN I TCFR H KV 
NPNVKSKACEKV1SVG0TV1TKHRNTRYYSCRVMAVTSQTFYEV 
MFDDGSFSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKY FGSN I AHMYQVE FEDGS 01 AMKREDI YTLDEELPKRV KARF 
VSAGRCHLGTCOVNSLS SPHVS0AQ0ETYLGFWINSKKSQCN1 F 
LSGTY 


6057


1 


B53 


GSUYLVICGQDDGPPGSEDPERDDHEGQPRPRVPRKRGHISPKS 
RPWANSTLLGJjLAPPGEAWG J LGQPPNRPNHSPPPSAKVKX I FG 
WGDFYSNIKTVALNLLVTGKIVDHGNGTFSVHFQHNATGQGN1S 
ISLVPPSKAVEFHQEQ01FI EAKASKI FNC\RMEWEKVE\RGRR 
TSLFTHDPAKI CSRDHAOSSATWS CSQPFKWCVYI AF YSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 


HPLPSASLGLPSVSLGVSbCVRSALLEAWPMLPKRRRARVGSP 
SGDAASSTPPSTRFPGVAI YLVE PRMGRSRRAFLTGIARSKGFR 
VLDACSSEATHWMEETSAEEAVSWQERRWAAAPPGCTPPALLD 
3 S VJLTES LGAGQPV PVECRHRLE VAGPS KGPLSPAWMPAYACQR 
P7PLTHHNTGLSEALE I LAEAAGFEGSEGRLLTFCRAASVLKAL 
PS P VTTLSQLQGLPHFGEBSSRVVQEbLEHGVCEEVERVRRSE / 
RLFTQIFGVGVKTADRWYREGLRTl>DDIiREQPQKLTQOQKAGEP 
S R E AGPWAS LN CTLDP S AS TP 


6059 


2 


3650 


QODFSSLADLTDHRAHRCPGDGDDDPQLSWVASSPfSKDVASPT 
QmGDGCDLGLGEEEGGTGhPYPCQFCDKSFlRLSVLKRHEQIU 
S DKL2FKCT YCSRLFKH KRS R DRH I KLHTGDK K YHCHE CEAAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLOSHMQAHKKNK 
EHLAXS EKEAKKDDFMCDYCEDTFSQTEELEKHVLTRHPQLS EK 








ADLQCIHCPEVFVDENTLLAH3H0AHANQKHI<CPMCPE\QFSSV 
\EGVY CHLDSHRQPDS SNHS VS PDPVLGSVASMSSATPDSSASV 
ERGSTPDS TLKPLRGQKKMR DDGQGWTKWYS CPYCS KRDFNSL 
AVLE IHLKT I>1ADKPQQSHTCQ 1 CLDSMPTLYNLNEHVRKLHKN 
HAYPVMQ FGNI S AFHCNYCPEMFAD I NSLQEH I RVSHCGPNAKP 
SDGIWAFFCNQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQSFMEVYSCPYCTNSP3 FGS ILKLTKHI KENHKNIPLAKS 
KKSKAEQSPVSSDVEVSSPKRQRLSASANS3SNGEYPCNQCDLK 
FSNFESFQTHLXLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVXJD\LQKH\LLDMPHPLCCTHCT\L 
CQEVFDS\KVSI \QVHLAVKH£NEKKMYRCTACNWDFRKEADLQ 
VHVKHSHLGNPAKAHKCIFCX3ETFSTEVELQCHITTKSKKYNCK 
FCS KAFHAI I LLE KHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLQGMLLKNPEAPNSHEASE DDVDAS EPMYGCDICGAAYTME 
VLLQNHRLRDHN I RPGEDDGSR KKAE FI KGSHKCNVCSRTFFSE 
NGLREHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD 
TGTCR I CKMPLOSEEE P T FH fCMH PDLRNSL^GFRCWCMOTVT 
STLELKIHGTFHMQKLAGSSAASSPNGQGLQKLYKCALCLKEFR 
SKQDLVKLD VNG L PYGLCAGCMAR S ANGQVGG LAPPEP ADR PCA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQC1 KCOMTFENERE3 Q3 HVANHMI EEGINHECKLCNQM 
FDS PAKLLCHL I EHSFEGMGGTFKCPVCFTVFVQANKLQQHI FA 
VHGQEDKI YDCS QCPQKFF FQTELQWHTMS QHAQ 


6060 


2145 


202 


S YE I VGKNKLEVNHSQLKALCKCS LPSRLLPLGENLPLLDRG FR 
KEPRS RGSRERDNMLHLHH S CLCFRS WLPAMLAVLLSLAPSASS 
DI SASRPNILLLMADDLG I GDIGCYGNNTMRTPN I DRLAEDG VK 
LTQK I S AASLCT PS RAAFLTG RYPVR SGMVSS 3 G YR VLQWTGAS 
GGLPTNETTFAKI LEEKGYATGLI GKWHLGLNCESASDHCHHPL 
HHG FDH F YGMP FS LMGDCAR WE L£ E KR VNLEQKLN FLFQVLAI»V 
ALTLVAGKLTHLI PVSWMPVI WSALSAVLLLASS YFVGAL1 VHA 
DCFLMRNHTITEQPMCFQRTTPLIIiQEVASFLKRNKHGPPLLFV 
FLHVH T PTiT TMFN FT /3 K Q 1 ,Uf?T .YfiDTJV K FMDWM VGR I LDTLDV 
EGLSNSTLIYFTSDHGGSLENQLGNTQYGGWKGIYKGGKGMGGW 
EGG 1 RVPGI FRWPGVLPAGRVI GEPTSLMDVFPT WRLAGSEVP 
QDRVlDGQDLLPLIiLGTAOHSDHEFLMHYCERFLHAARWHQRDR 

GTMWKVHFVTPVFQPEGAGACYGRKVCPCFGEKWHHDPPLLFD
LSRDPSETHILTPASEPVFYOVMER\VOOAVWEHQRTLSPVPLQ
LDRIA3N3WRPWLQPCCGPFPLCWCLREDDPQ


6061 


110 


1330 


MN3 HMKRKTI KK3NTFENRMLMLDGMPAVRVKTELLES EQGS PN 
VHNYPDMEAVPLLLNKVKGEPPEDSLSVDHFQTQTEPVDLSINK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRIASSPTVITS 
VSSAS S SSTVLTPGPLVASASGVGGQQFLH 1 1HPVPPS 5PMNLQ 
SNKLSHVHRIPVWQSVPWYTAVRSPGNVNNT1WPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTAJ^SIA 
RAVQEYHPS PVSRVRGNRMNNQKFPCS 3 SPFSIESTRRQRTVLN 
P PDSR KTA Y STDCDF\ EGLQQKL YTKSS S PGRVHRRTHTGE KPY 
KCTWEGCTWKFARSDELTRH YR KHTGVKPFKCADCDRS FSRSDH 
LALHRRRHMLV 


6062 


71 


1079


ETMAKNGPI^CEXJCHILNAEAFKSKKICKSLKICGLVFGILALT 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYME1DPVTR 
TE3FRSGNGTDETLEVHDFKNGYTGI YFVGLQKCFIKTQI KVI P 
E FS E PEEE I DENEE I TTTFF EQS V I WVP AEKP I ENRDFLKNS KI 
LE I CDNVTMYW \ INPTL\ I SGT FAKQLHHNFAFI ILVS ELQDFE 
EEGEDLHFPANEKKG I EQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENG1EFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP 
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPENCEDCHIIJNAEAFKSKKICKSLKICGLVFGILAJL.T 
L I VLFWGS KH FW PEVP KKAYDMEHTFYSNGEK KK.I YM E 1 D ?VTR 
TE3FRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP 
EFSEPEEE3DENEEITTTFFEQSV3WVPAEKPIENRDFLKNSKI 
LB 3 CDNVTMY W\ INPTL\ 3 SGTFAKQLHHNFAF I ILVS ELQDFE 








EEGEDLHFPANEKKGIEQNEQWWPCVKVEKTRHARQASEEELP 
INDYTENGlEFDFMLDERGyCCIYCRRGNRYCRRVCEPLUGYYP 
YPYCYOGGRVICRVI MPCNWWVARMLGRV 


6064 


913 


311 


flLPQSLPRPTEHSPPYSLEKMTDLVAVWDVALSDGVHKIEFEHG 
TTSGKR WYVDGK E E I R KEWMFKLVGKETFYVGAAKTKATIN1D 
AISGFAYEYTLE1NGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LEKDAKDVWCNGKKLETAGEFVDDGTETHFS IGTH\ACYI KAV\ 
SSG\KRKEGIIHTLIVDNREIPEIAS 


6065 


1153 


641 


NSVRVARVAWVRGLGASYRRGASSFPVPPPGAOGVAELLRDATG 
AEEE APWAATERR M PGQCSVLLFPGQGSQWGKGRGLLNY PRVR 
ELYAAARRVLG YDLLE LSLHG PQETLDRT VHCQ PAI FVAS LAAV 
EKLHHLQ P S VI ENCVAAAGF £ VGEFAALVFAGAMEFAEG 


6066 




3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 
EDLDDDDP KF INVGEKAY SCALKSGKLVTAVSNNTI 0VHTFPEG 

VPDGJLTRFTTNANHVVFNGDGTKIAAGSSDXFLVKIVDVMDSS

00KT? RGKDAP VIjSLSFDPKDI FLASASCDGSVRVWQI SDQTCA 
ISWPLLQKCNDVI NAKS I CRLAWQPXSGKLLAI PVEKS VKLYRR 
ESWSHQFDLSDNF2 S QTLN I VTWS PCGQYLAAG S I NGL I IVWNV 
ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CDPSGXTSSSKVSSRVEKDYNDLFDGDDMSNAGDFLNDKAVEIP 
SFSKG I I NDDEDDEDLMMASGRPRQRSHILEDDENS VD I SMLKT 
GSSLLKEEEEDGOEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS 1GI IRCYNDEQDNAIDVEFHDTS I HHAT 
HLSNTLNYTIADLSHEAILLACESTDELASKLHCLHFSSWDSSK 
EKI IDLPQNEDI EA I C1MQG WAAAATSALLLRLFTIGGVQKEVF 
SLAG P VVSMAGHGEOLFI VYHRGTGFDGDQCLG VQLLELGKKKK 
QILHGDPLPLTR KS YLAWI GFSAEGTPCYVDS EG I VRMLNRGLG 
NTWTPI CNTREHCKGKSDHYWWGI HENPQQLRCI PCKGSRFPP 
TLPRPAVAILSFKLPYCOIATEKGQMEEQFVJRSVIFHNHLDYLA 
KNGYEYEESTKNQATKEQQELLMKMLALSCKLEREFRCVELADL 
MT0NAVNL AI KYAS RSRKL I LAQKLSELAVEKAAELTATOVEEE 
EEEEDFR XKLNAG YSNTATEWSQPR FRWQVEEDAEDSGEADDEE 
KPEIHKPGONSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 
SKEPAMSMMSARSTNILDNMGKSSKKSTALSRTTNNEKSPIIKP 
LIPKPKPKOASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPCNTENQRPKTGr QMWLEENRSNI LSDNPDFSDEADI I KEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067 


858 


321 


L?WQRI>GVLIiSRGKJ^WGWLESU2TAQKTALLQDGRRKVHYLF 
PDGKEMAEEYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPL 
GAGNLG PEL I KESNANPI FMR KDTKMS FQWRI RNLP YPXDVYS V 
SVDQKERCI I VRTTNKKYYKKFS 1 PDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 


GS KMADLANEEKPA I AP PVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGE AEAPK HG TGK PES AGEHALE P PAPAGASAS TP P P P APEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSOSEERSSGFRLKPPTLIHGOAPSAGLPSQKPKEQORSVLR 
PAVLQAPOPKALSOTVPSSGTNGVSLPADCTGAVPAASPDTAAW 
RS P S EAADE VCALEE KE PQKN ES SNASE E EACE KKDPATQQA FV 
FGQNLRDRVKLINESVD3ADMENAGHPSADTPTATWYFL0YISS 
SLEN3TNSADASSKKFVFGQNMSERVLSPPKLNEVSSDANRENA 
AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT 
GEEAESNVLOMQCKLFV FDKTSQSWVERGRGLLRLNDMAS TDDG 
TLQSR LSDAGPRGSLR\LILNTKLWAQMQIDKAS EK\ S I R I TAM 
DNEDQG VK VFLISASS KDTGQVyAALHHR 1 LALRS RVEQEQEAK 
M PAP E PG AAP SNEEDDS DDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


583


27 


PTRPGQAGS SSAMAAORLGKRVLSKLQS PSRARGPGGS PGGLQK 
RKARVTVKYDRRELORRLDVEKWIDGRLEELYRGMEADMPDEIN 
IDELLELESEEERSRKIC^LLKSCGKPVEDPIQELIAKLOGLHR 











Q\ PG LRQPS PS ? \ DGQPSAPFQG PGARTAS PLTLLALFPGP PER 
RPALLCVLSC3 


6070 


476 


858 


I RVTVDGEFLH Y 1 FPLQFbDS FEW/ RFTETHRGRHFXQVTLTAE 
TDCRYVSWRRKKLYLLFAQHRYISRLFSVL1GSDIADKLYALND 
RVY3GKRYKYDIRLPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071


2 


1654 


HEARTKGNKAliARP\VRLFSLVTRLLLAPRRGLTVRSPDEPLPV 
VR1 PVALQRQLEQRQSRRRNLPRPVLVRPGPLLVSARR PELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFS1ERAQQEAPAVRKLS 
SKGS FAD LG AW K ? R VLHALQE \AAP E WQ\ PTTVQSST1 PSLLR 
GRHYVCAAETGSGKTLSYLLPLLQRLLG\HPSLDSLPIFAPRGL 
VLVP SR EbAOO VRAVAQPLGRSLGLLVRDbEGGHGMRR I RLQLS 

D/">no T\ rM it \ i 7\ T n r- 7\ t mvht vrrir tn fz-tT **c?t t rr tm?htvht 7 T"\"~>r> 

r b AD V Jb V A 1 ALW KAL»KS RI* J S 1»E Q hS r LVX»DEADTijijD^b> 
FLELVDY1LEKSHIAEGPADLEDPFNPKAQLVLVGATFPEGVGQ 
LLNKVAS PDAVTTI TS S KLHC I MPHVXQTFLR LKG AD KV AEL VH 
I LKHR DRAERTGPSGTVLVFCNSSSTVNWIiGY I LDDHKIOHLRL 
OGOMPALMRVGIPQSFQKSSRDlLLCTDIASRGliDSTGVELVVN 
YDFPPTLODYIHRAGRVGRVGSEVPGTVISFVTHPWDVSIiVOKI 
ELAARRRR5I>PGliASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSOLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
V L»r S R RTS KOU V Y F tr LiFNDVIjI I TKKKS EES YNVwDYSIjKDQL»Lj 

VESCDNE ELNS S PG KNSSTMLYSRQS SASHLFTLTVLSNHANEK
VEMLLGAETQSERARW I TALGHSSGKPPADRTSLTQVE IVRSFT
AKQPDELSLQVADWLI\YQRVSDGWYEGER\LRDGERGKFPME

CAKEITCQATIDKNVERMGRLIjGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/SILVHCAVGVSRSATLVLAYLMIiYHHLT 
LVEAI KKVKDHRGI I PNRGKJbRQLJ.iALDRRLRQGLEA 


6074 


168 


1110 


PGARCI^ATELQCPDSMPCHNQQVNSASTPSPEOLRPGDLILDHA

GGNRAS RA KV ILLTG YAHS SL PAELDS GACGGS S LNS EGNSGSG 

DSSSYDAPAGNSFLEDCELSRQIGAOLKLLPMNDOIRELQTIIR
DKTAS RG DFMFS ADRLIRLWEEGLNQL P YKECMVTTPTGYKYE

GVKFEKGNCGVS I MRSGEAMEQGLRDCCRS IRIGKILIQSDEET 
QRAKVYYAKFPPDIYRRKVLLMYPILQTG\NTV1EAVKVL1EHG 
V0PSVI I LbSLFSTPHGAXS I IQEFPEITI LTTEVHPVAPTHFG 
OKYrGTD 


6075 


320


1091 


P?TCOPOEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE I ERAECTI RMfJDAPTTGYSADVGNKTTYRWAH SS VFRV 
LRRPOEFVNRTE ETYF IFWGP PS KMQKPQGSLVRV I QRAGLVFP 
NKEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDKVHVYGM VP PN YCS QRPRLQRMP YHY YEP KG PDECVTY I 
ON EHS R KGNHHR F I TE KR VPS S W AQLYG ITFSHPSWT 


6076 


1721 


107 


H PSPTEAPR VOHLTMDCTWRI LFLVAAATGTHAQVQLVQSGAEV 
KXPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 
G ET I YAQKFQG R VTM TEDTSTDTA YMELSSLR S EDTAVY Y CATD 
HGDYAFD I WGQGTM VTVSSAPTKAPDVFPI I SGCRHP KDNSPW 
IiACLITGYH?TSV\TVTMYMGTQSQA\QRTFPEIQRRDSYYMTS 
S0LSTPLQOWROGEYKCWQHTASKSKKEIFRWPESPKAOASSV 

TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 
AHLTWEVAGKVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 
GTSVTCTLNHPSLPPQRLMALREPAAQAPVKLSLNLLASSDPPE 
A \ AS WLL CE VSG FS PPN I LLMWLEDHGE VNTSG FAPARPL? KP \ 
RSTTFWA\WSVLRVPAPPSPQPATYTCWSHEDSRTLLNASRSL 
E VS YVTDHGPMK 


6077 


3687 


1268 


IiLPDMNLQ P I F W I G L I S S VCCVFAQTDENRC LKANAKS CG EC I Q 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 
GS KD I KKNKNVTNRS KGTAEKLKPEDITQI Q PQQLVLRLRSGEP 
QTFTLKFKRJ^DYPIDLYYI^\DLSYSMKDDLENVKSLGTDLMN 
EMRRI TSDFR 1GFGS FVEKTVNPYI STTPAKLRNPCTSEQNCTS 
PFS YKNVLS LTNKGE VFNELVGKQR I SGNLDS PEGGFDAI MQVA 
VCGS LI G WRNVTR LL VFS TDAGFHFAGDGKIX5G I VLPNDGOCHL 











EIWMYTMSHYYDYPSIAHLVQKLSENN1QTI FAVTEEFQPVYKE
LKNLIFKSAVGTLSANSSNVIQLIIDAYNSLSSEV1LENGKLSE 
GVT I S YQS Y \ CKNG VNG7GENGRKCSN 2 S IGDEVQFEI S I TSNK 
CPKKDSDSFKIRPLGFTEEVEVILQYJCECECOSEGIPESPKCH 
EGNGTFECGACRCNEGRVGRHCECSTDEVWSEDIGCFTARKENQ 
FOKSASNHGRVPSAGQCVCRKRDNTNEIYSGKFCECDNFNCDRS 
NGbl CGGNGVCKCRVCECNPNYTGSACDCSLDTSTCEASNGQI C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHWENPECPTGPDIIP1VAGWAC 
I VL I GliALbL I W KLLM 1 1 HDRRE FAKFE KE KMN AKWDTG ENP I Y 
KS AVTTWN P KYEGK 


6078 




180 


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGILE 
GSVRNSLWRPVPFKCPTCRKKTFSYWELlPLQVNYSIiKGIVEKY 
NKI X I S PKMP VCKGH \ LGQPLNI F\ Ch\ TDMQLDL/CG I C\ATR 
GEHTKHVFCS IEDAYAQERDAFESLFQS FETWRRGDALSRLDTL 
ETSKRKSLQLLTKDSDKVKEFFEKbQHTLDOKKNEILSDFETMK 
LAVM0AYD P E INKLNTI LQEQRMA FNI AEAFKDVSEP I V FLQQM 
0EFREKI KVI KETPLPPSNLPASPLMKNFDTSOWED I KLVDVDK 
LShPQDTCTFI SKI PWSFYKLFLL I LLLGLVI VFGPTMFLEWSL 
FDDLATWKGCLSNFSSYLTKTADFI EQSVFY WEQVTDGFFI FNE 
R FKNFTLWLNNVAEFVCKYKLL 


6079


I58f 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCRNLQEFLGGbSP 
GVLDRLYGHPATCLAVFREIjPSl^KNWVMRMLFliEQPLPOAAVA 
LWKKEFSXAQEESTGLLSGLRIWHTQLLPGGLQGL1LNPIFRQ 
NLRIALbGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEVVIi 
HFMVGSPSAAVSODLAOLLSOAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YSVEGMSDSLLNFLOHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VKIiKQTPVLPPTITDQIHLWEIiERDRLRFTEGVLYNQFLSQVDF 
ELL\ LAHAPKLG VLVFE /NT P AKRLM WTP AG H S DV KR FWKRQK 
HSS 


6080 


1


1199 


I ET I DH VGEFAMAAOAAGVSRORAATQGLGSNONALKYI^ODFK 
TLRQQCLDSGVLFKDPEFPACPSALG YXDLGPGS PQTQG 1 1 WKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTIiNEELL 
YR W PRDQDFQENYAGI FH FQPLCPPS? \FW0YGEWVEWI DDR 
LPTKNGQLLFLHS EQGNE FWS ALL E KAYAKLNG C YEALAGGSTV 
EGFE DFTGGI SEFYDLKKPPANLYOI IRKALCAGSLLGCSIDVY 
S AAE AEAI TSQKLVKSHAYS VTGVEE VNFOGHPEKLI RLRNPWG 
EVEKSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 
QFSRLEICNLSPDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
?GSS 


6081


3 


865 


EMLP LLLPLPLLWA/GALAQDARFRLEM PESVTVQEGLCI FVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDOETPVATNNSTQKVOKB 
TOGRFHLLGDPSRNNCSLSIRDARRRDNGSYFFWVARGRTKFSY 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFS WMSAAPTSLGPRTLHSSVLTI J PRPQDHGTNLI CQVTFP 
GAGVTTERTIQLSVSWKSGTVEEWVLAVGWAVKILLLCLCbl 
I LSFKKKKAVRAVEVEENVYAVMG 


6082 


283 


1288 


EARSPGPTQTRTAPGLAAPGLAQPAALRLLLSRPPSAAMDGDGD 
PESVGOPEEASPEEOPEEASAEEERPEDQOEEEAAAAA\Y\LDE 
LPEPLLA/LRVLAALPRHE\LVQACR\LVCLRWKELVDGAPLWL 

EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQVIDLQAEGYWEELLDTTOPAIWKBWYSGRSDAGCLYELTV 
KLLSEHENVLAEFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSWfVEP 


6083 


1865 


309 


KOWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE











DV0EET0LDLSGDSVKTIAKLWDSKMFAEIKMK1EEYISKQAKA 
SEVMG P VEAAPEYRVI VD7SNNLTVE I ENELN 1 1 HKFI RDKYSKR 
FPELESLVPNALDY I RTVKELGNSLDK CKNN ENLQQI LTNAT I M 
VVSVTASTTQGOOLS EEELERLEEACPMALELN ASKHRI YE YVE 
SRK^FTAPNTiSl 1 IGASTAAKTMGVAGGIiTMT.^KMPAPNTMT.T CZ 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RR KAAR IiVAA KCTLAAR VDS FH ES T EG K VG Y E LKDE I E R KFD KW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
JuMRMSFGElEEDAYQEDLGFSLGHLGKSGSGRVRQTOVNEATKA 
R I S KTLQRTLQ KQ£ WYGGKST I R DR S SGT AS S V AFT PLQGLE I 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGIjMST 


6084 


186E 


209 


KQWCAERRGLGMSIJuOEU^LEEAAEEEEGGSYGEEEEEPAlE 
DVQE ETQLDLSGDS VKTI AKLWDS KM FAE I MMKI EEYIS KQAKA 
SEVMGPVEAAPEYRV1VDANNLTVEI EKELN I IHKFI RDKYSKR 
FPELESLVPNALDY IRTVKELGNS LD KCKNNENLQQ1 LTNAT I M 
WSVTASTTQGQQLSEEEL.ERLEEACDMALELNASKHRIYEYVE 
c K M£> rlf\JfNLbili VjAo I AAKI Mlj V A w> i> J J>* LS KMFACN I MLIiG 

aqrktlsgfsstsvlphtgyiyhsdivqslppipppfsvapXdl 
rrkaarlvaakctlaarvdsfhes tegkvg y elkdei erk fdkw 
cepppvkqvkplipapldgqrkkrggrryrkmkerlglteirxkq 
anrms fge 1 eepayqedlg fslgh lg ksgsgr vrqtovneatka 
risktlqrtlqkqswyggxstirdrssgtassvaftploglei 
vnpqaaekkvaeanqkyfssmaeflkvkgeksglmst 


6085 


2 


1456 


SGPRSFQGNRAVGRISLGGKRNPEVTbLPGVSSERVRRWRRARV 
GVARVKPGNPWKPSPATQVPR/VPAOVYLPGRGPPLREGEELVM 
DEEAYVLYHRAQTGAPCLSFDIVRDHLGDNRTELPLTLYLCAGT 
OAESAOSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPOLELAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFAbR 
RIjLQVViEPQALAAFI/RDEQACWKP I FSF AGHMGEGFALDWSPR 
VTGRLLTGDCQKN I HLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
S P TENT VFAS CSADASI R I WD I RAA PS KACMLTTATAHDGD V3>7V 
I S WSRREPFLLSGGDDGALKI WDLRQ FKSGS PVATFKQHVAPVT 
SVEWHPODSGVFAASGADHQITQWDLG/IVERDPEAGDVEADPG 
LADL PQQLLFVHOG ETELKELHWH P 0 C PG LhVS TALS G FT I FRT 
ISV 


6086 


2419 


1357 


GAATOHGGAKNLLPCNPHGNGLLYAGFNODHGCFACGMEKGFRV 
YMTDPLKEKEKQE FLEGG VGHVEMLFRCN YLALVGGGKKFKY PP 

y* t\.VP f iS WULJ J-UV*\J\. 1 v I £,±k.r £? 1 fc, vIvAv KJjKK \UKJ V V VL»L/orlJL JV.V 

FTFTKNP\HOLHVFE\TCYNPKGLCVLC?NSKrKSLLAFPGTHTG 
HVQLVDLASTEKP PVDI PAHEGVLSC3 ALNLOGTR I ATASEKGT 
LIRIFDTSSGHLIQELRRGSQAANIYCIKFNODASLICVSSDHG 
TVHI FAAEDPKRNKOSSLASAS FLPKY FSS KWSFSKFQ VPSGS P 
CI CAFGTEPNAVIAI CADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 


6087 


476 


1877 


ONSQRTGLPITIFSRSFPLLTGSDLCENMPCTCTWRNWRQWIRP 
bVAVI YLVS I WAVPLCVWELQKLEVG I HTKAWFI AGI FLLLTI 
PISLWVILQHLVHYTQPELQKPI IRILWMVPI YSLDSWIALKYP 
GIAIYVDTCRECYEAYVIYNFMGFLTNYLTNRYPKLVLILEAKD 
OOKKFPPLrrr"PPWAMGEVT.I.FRrKT/=VT,OYTVVRPFTTTVAT.T 
CELLGIYDEGNFSFSNAWTYLVIINNKSOLFA.VJYCLLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWlALL'vnCVGVISEKHTW 
EWQTVEAVATGLODFIICIEMFLAAIA\HHYTFSYKPYVQEABE 
GS CFDSFLAMWDVSDI RDDI SEQVR HVG R TVRGH PR KKLFPEDQ 
DONEHTSLLSSSSODAISIASSMPPSPMGHYOGFGETVTPQTTP 
TTAKISDEILSDTIGEKXEP5DKSVDS 


6088 


1684 


689 


GASGLVRLLQQGHRCLLAPVAPKLVPPVRGVKKGFRAAFRFQKE

LERORLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
KTAFVNS CYIKSEEAKRQOLG I EKEAVLLNLKSNQELSEOGTSF 
S QTCLTQ FLEDE Y PDM PTEG I KbDJ\TOFLTGEEWCHVARNLAVE 
OLTLSEEFPVPPAVLOQTFFAVIGALLCSSGPERTALFIRDFLI 
TOMTGKELFEMWKIINPMGLLVEELKKRNVSAPESRLTRQSG\A 











PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVALRKLYGF 
TENRRPWNYSKPKETLRAEKSITAS 


6089 


3 


3054


TRLGI PGSTI SSR PRLCALAAEGHFLGH S WTGSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQS LVKHSSG I KGSLPLQK 
LHLVSRSIYHSHHPTLKLORPQLRTSFQQFSSLTNLPLRKLKFS 
P I KY GYQPRRNFW PAR LATRLLKLRYLI LGS AVGGG YTAXKTFD 
QWKDM1 PPLSEYKW I V?DI\WE I DEYIDFEKI RKALPSSEDLVK 
LA PD FDK I VES LSLL KD FFTSGS PEETAFRATDRGS ESDXH FRK 
VS DKEKIDQLQEELLHTQLK YQR ILERLE KENKELR KLVIjQKDD 
KGIPFIESLRKSLIDMYSEVLDVLSDYDASnOTDHLPRWWG 
DQSAGKTS VLEMIAQARI FPRGSGEMMTRS PVKVTLSEGPHHVA 
LFKDSSREFDLTKEEDLAALRHEIELRKRKNVKEGCTVSPETIS 
LN VKGPGLQRMVLVDLPGVI NTVTSGMAPDTKETI FS I SKAYKQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AE KNVAS PSRI QQl I EGKLFPMKALGYFAVVTGKGNSSES I EAI 
REYEEEFFQNSKLLKTSMLKAKQVTTRNLSLAVSDCFWKMVRES 
VEQQADSFKATRFNLE?EWiO*KYPRLRELDRNELFEKAKNEILD 
EVISLSOVTPKHWEEILOOSLWERVSTHVIENIYLPAAQTMNSG 
TFNTTVDIKLKQWTDKOLPNKAVEVAWETLOEEFSRFMTEPKGK 
EHDDI FDKLKEAVKEESI KRHKWNDFAEDSLRVIQHNALEDRSI 
SDKQQWDAAI YFMEEALQARLKDTENAI ENMVGPD\WKKRWLYW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSL I KOTWKQVYRRHFLKTALNHCNLCRRGFY YYQRH 
FVDS ELECNDWLFWR 1 QRMLAITANTLRQQLTNTEVRRLEKNV 
KEVLEDFAEDGEXK1KLLTGKRVQ1AEDLKKVREI0EKLDAFIE 
ALHQEX 


6090 


194 


1560 


PVFVPAPGAVLEQAS/ AS PPLATQT WPLQKCKI PELPVQAS IL 
FELQLFFCQLIALFVHYINIYKTVWWYPPSHPPSHTSLNFHIilD 
FNLLMVTTI VLGRRFIGS I VKEASQRGKVSLFRS I LLFLTRFXV 
LTATGWSLCR5LlHLFRTYSFLNLL/FPLLSVWDVHSVPAAELiR 
P\R KTS LFNKMASMGPR E A VSGLAK S RD YLLTLR \ RRGSS TQDS 
CMARTPCP /PHACCLS PS L I RSEVEFLKMDFKWRMKE VLVSSML 
SAYWAFVPWFVKNTHYYDKRWSCELFLtiVSlSTSVlLKQHLL 
PASYCDLLHKAAAHLGCWOKVDPALCSNVLOHPWTEECMKPQGV 
bVKKSKNVYKAVGHYNVAlPSDVSHFRFHFFFSKPLRILNILLL 
LEGAVIVYOIjYSLMSSEKWHQTI SLALI LFSNYYAFFKLLRDRL 
VLGKAYSYSASPQRDLDKRFS 


6091 


3279 


412 


SSRTREMSEKEILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PS DF PADHAVR PLHGARGGQP P VPQQH VLERQVQLSQGQNVVT K 
VKPPS KSGSASAS GAQRGS LE E r EDT P WSDQRPRE GEGEPPRGQ 
liQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRIUjGSHSVASCAPQ 
LLGDRRVDAGHTDQP VP SG SVGGPARP ASGPRQAREAS LWTCR 
TNKFRKNNYKWVAASSKSPRVARRALS PRVAAENV CXASAGMAN 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 
SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 
TP LS A Y KVKSRTKI I RR RGS T S LPGDKKS GTS PAATAKSHLSLR 
RRQALRGKSSPVLKKTPNKGLVQVTTHRLCRLPPSRAHIiPrKEA 
SSLHA VRTAPTSKVI KTtf YR 1 VKKTPASPLSAPPFPLSLPSWRA 
RRLS LS RS LVLNRLRP VASGGG KAQ PGS P WWRS KGYRC I GG VL Y 
KVSANKLSKTSGQPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 
RSLAI IRQARQRREKRKEYCMYYNRFGRCNRGERCPYI HDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGICSNSN 

ACPRGAOCQLLHRTOKRHSRRAATSPAPGPSDATARSRVSASHG 
PRXPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSS SPPASLDHEAPSLQEAALAAACSNRLCKLPS FI SLQS 
S PS PGAQPRVRAPRAPLTKDSGKPLHI KPRL 


6092 


143 


3190 


A KAP P TG E S S E P E AKV LHT KR L Y RAWEA VI I RLD LI LCN KT A Y Q 
EVFKPEN J SLRNKLRELCVKLM FLHPVDYGRKAEELLWRKVYYE 








V1QLI KXNFKKH I HSRSTLECAYRTHLVAG IGFYQHLLLY IQSH Y 
0LELQCCIDWTHVTDPL1 GCKKPVSASGKEMDWAOMACHRCLVY 
LGDLSRYQNELAGVDTELLAERFYYQALSVAPQIGMPFNQ1GTL 
AGS KY YNVEAM YCYLR C I OS EVS FEGAYGNLKRL YDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPD1* 
L1FQMVIICLMCVHSLERAGSKQYSAAIAFTLALFSHLVNHVNI 
RUJAELEEGENPVPAFCSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQ VG EGRKSRKFSRLSC LRRRRH PP K VGDDS DLS EGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRS PTL E PPRGR £ E A PDSLNG PLG PS EAS I ASNLQAMS TQ 
MFQTKRCFRlAPTFSNLLliQPTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF 
LDWLRTNPDL1 IVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA 
LCPEVQDLLEG CELPDL PS S LLL PE DMALRNLP P LRAAHRR FNF 
DTDRPLLST LE E SWR 1CCIRSFGHFI ARLQG S I LQFNPE VG I F 
VSIAQSEQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV 
SQL.EGSLQQP KAQS AMS F YLVPDTOALCHHLP VI RQLATSGRFI 
VIIPRTV1DGLDLLKKEHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG 
MVTI I TGLPLDNPSLLSGPMQAALQAAAHASVDI KNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLALR VAR T / S R WGAL \ RG A VW APGTR PS KRRA C WALL 
PPVPCC1jGC1aAERWRLRPA7UjGLRLPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWG PASTPSLYENPWTI PNMLSMTRIGLAP 
VLGYLIIEEDFNIALGVFALAGLTDLLDGFIARNMANQRSALGS 
ALDPIiADK I L I S I LYVS LT YADLI P VPLTYM 1 1 S RDVML I AAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLILVA 
AS LAA P VFN Y ADS I YLQ 1 LW CFTAFTTAASAY S Y YH YGR KTVQV 
IKD 


6094 


23 


1010 


PFLRCLRGDOKA KJMS ER K V1»N K Y Y P PDFDPS KIP KLKL PKDRQ Y 
WR LMAP FNMR CKTCG E Y I Y KG KK FNARKET VQNE VY LGL P I FR 
FYIKCTRCLAEITFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
R VQ KE REDE ELNNPMKVLENRTKDS KLEMEVLENLQE L KDLNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLWVKKAKADPDCSNGQPOA/APHPRSPAEQEG 
GQP YTPDAWR VLPEPTGCI PGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFOPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSODFVGEKLGSGEPSHS 
VKVHT V PKPGXGADLS K P PCR KAKE I RKER KRLKLMQQN PAG EL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDL2FESLPENASHKLEVR 
WRSSPPSSQFKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCS S PLE AETP PNGPD CG YG S FHQQ YWLDGK I I A VG V I D I L P N 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSOLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRP.IMPYGVYKKOOKDPS 
EEAAVLQYASLVGQKCS ERMLLFRN 


6096 


2277 


575 


QR VRAALLS S AMEDSEALG FEHMGLDPRLLQAVTDLGWS R PTL I 
QEKAI PLALEG KDLLARAR TGSG KTAAYA I PMLQLLLH R KATG P 
WEQAVRGLVLVPTKELARQAQSMIQQIATYCIARDVRVA^TVSAA 
EDSVSORAVLMEKPDWVGTPSRILSHLCQDSLXLRDSLELLW 

LI LHN P VTL KLQESOL PGPDQLQQFQ WCETE EDKFLLL Y ALL K 
LSLI RGKSLLFVNTLE RSY RLRLFLEQFS I PTCVLNGELP LRSR 
CHI I SQPNQGFYDCV1 ATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARAJWPGIV 
LTFVLPTEQFHLGK1 EELLSGENRGPILLFYQFRMEEI EGFRYR 
CRD AMRS VTKQAI REARLK E I KEELLHSEKLKTYFEDN PR \ DLQ 
LLRHDL P LH PA WKPH LGH V PDYLVP PALRGLVRPHK K \ G R S CL 
PLVGRPREQS P RTH CAASS TKERKS DP QPS PPE WGPLW S 


6097 


1673 


192 


APGTMSGGKKKSSFQITSVTTDYEGPGSPGASDPPTPQPFTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRWKLFHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEGIRGASGGAGGRSLDSRL 
ELASIjGIjOAPTPPSGLSOGPTfaWLRPPPTSPCjPyAKSr IGGLGQ 
L WPS KAXAEKPPLSAS S PQQRPPEPETGES AGTSRAAT P LPS L 
RVEAEAGGSGARTPPLSRRKAVtWRLRl'lELGAPEEMGQVPPLDS 
RPSSPALYFTHDASLVH KS PDPFGAVAAQKFSLAHSMLA 1 SGHL 
DSDDDSGSGSLVG I DNK I EQAMDLVKSHLMFAVREEVEVLKEQI 
RELAERNAALEQENGLLRALAN SPEQLGSAGPPRGVPR\LG P PA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 

QPPPSLPGTPQQ 


6098 


168 


1074 


HYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSN1NPRQTETSVNASRSPEKCAO0ROK 
RLNSAS QRS SS LP PSNR KS S TP TfCRE 3 MLTP VTVAYS P KR S PKE 
I«JSPGFSHLLSKNESSPIRFDILLDDLDTV?VSTLQRTNFRKOL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWEKNKSVfiYFOCKPVSVTPOGMDFKYTAXIRTLAETERFF\D 
ELTKEKDQ I E AALS RMP S PGGR I TLQTRLNQEAFGRS FGKI' 


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD 
EGKIFKNWGTQTEKEDTSN3NPRQTETSVNASRSPEKCA00RQK 
RLN5ASQRS SSLPPSNR KSSTPTKREI MLTPV1VAYSPXR S PKE 
NLSPGFSHLLSKNES SPl'RFDI LLDDLDT VP VS TLQRTNPR KQL 
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR 
TAWSKNKSVSYEOCXPVSVTP0GNDFEYTAKIRTLAETERFF\D 
ELTKEKDQ I EAALSRMPS PGGR ITLQTRLNQEAFGRS FGKD 


6100 


2 


713 


FVEV3GYRSRADPEPRGRDTMTYAYLFKYI I IGDTGVGKSCLLL 
QFTD KR FQP VHDLT I G VE FGARMVN I DGKQ I KLQ1 WDTAG 0ES F 
RS ITRSYYRGAAGALLVYDITRRETFNHLTS WLEDARQHS S SNM 
V1MLIGKKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEIYRKIOQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRXSRD IGSNSGCC 


6101 


1 


1399 


FRGRAWPLREVSHWLGCR RVCSWS AS WGRLPALS ARLS PLLAFR 
G KMV FPLSC AVQQY AWGKMGSNS E VARLLAS S D PLAQ I AE D KP Y 
AELWMGTHPRGDAKI LDNRI SQKTLSQWIAENQDSLGSKVKDTF 
NGNLPFLFKVLSVETPLSIQAHPNKEIAEKLHLQAPOHYPDANH 
KPEMAIALT? FCGLCG FRP VEE I VTFLKKV F E FQFL I GDE AATH 
LKQTMSHDS QA VASS LOSCFSHLMKSEKKWVEQLNLLVKR ISQ 
OAAAGNNMEDI FGELLLQLHQQYPGDIGCFAI YFLNLLTLKPGE 
AMFLEANVPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSS KDRLFLPTRSCEDP YLSI YDP PVPDFTI MKA\ EVP 
G\ S VTEY KDLALDS AS 1 LLKVQGTVI AST PTTQTP 3 PLQRGGVL 
F I GANESVS LKLTE PKDLLI FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGSIGAS PAAPCCSESGDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
SLKKLDKLIEORTVSKMQliEEQVLTlSSEIPKRlRSALKNAEES 
KOFLNQFLEOETHLFSAINSHLLTAQPKKDDLGTMISOIEEIER 
H LiAYLKW I S 0 1 E ELS DN I QQYLMTN>JVPE AA S TLVS MAELD I KL 
OES SCTHLLG FMRATVKFWHKI LKDKLTS D FEE I LAObHWPFI A 
PPOSQTVGLSRPASAPEIYSYLETLFCQLLKLQTSHELLTEPK\ 
HSQKNTLFLP PLLSS / V)P IQVMLTPLQKRFR YHFRGNRQTNVLS 
KPEWYLAQVLMW3GNHTEFLDEKIQPILDKVGSLVNARLEFSRG 
LMMLVLEKLATD1PCLLYDDNLFCHLVDEVLLFERELHSVHGYP 
GTFASCMHILSEETCFORWLTVERKFALQKMDSMLSSEAAWVSQ 
YKDITDVDEMKVPDCAETFMTLLLVITDRYKJvILPTASRKLOFLE 
LQKDLVDDFR I RLTQ VMKEETRASLG FR Y CA I LNAVNY I STVLA 
DWADNVFFLQLQOAALEVFAENNTLSKLOLGOLASMESSVFDDM 
I NLLERLKHDMLTRQVDH V FRBV K.DAAKLY K KERWLS LPSQSEQ 
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLFKIFWQMLVEKLD 
V Y I YQE 1 1 LANH FN EGG AAQLQ FDMTRNL F P Ij FSHY CKR PEN Y F 
KHI KEACI VLNLNVGSALTAGKDVLPVQLOGS FPAT 


6103 


207 


2523 


ESNSTMTTYLEFIQONEERDGVRFSWNVWPSSRLEATRMWPVA 
ALFTPLKERPPLPP1QYEPVLCSRTTCRAVL-NPLCQVDYRAKLW 
ACNFCYQRNQFPPS YAG I S ELNQPAELLPQFS S I EY WLRGPQM 
PL! FLYWDTCMEDEDliOAIiKESNQMSbSLLPPTALVGL I TFGR 
MVQ VHELGCEG J SKS YVFRGTKDLSAKQLCEM LGLS KVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELORDPWPVPQGKR 
PLRSSGVALS I AVGLLECTFPNTG/iRIMMFI GGPATOGPGMWG 
DELKTPI RSWHDI DKDNAKYVKKGTKHFEAIiANRAATTGHVI DI 
YACALDQTGLLEMKCC PNLTGGYMVMGDS FNTSLFKQTFQRVFT 

VT^MUfinPlfMPrnflTT T?l VTOD\P7IfTCr , &TrDrVOT TCI c t>r*\7c 
iUJrIrloyr lu^lj r oo 1 1j£ J. Jyl JrK \Lt J. JVJ.ovrrlxvjr'^ V^LiJnoH.VjI'UVo 

ENEIGTGGTCOWKICX31^PTTTLA1YFEVVNOHNAP1POGG\RG 

A\ I Q F VTQ Y \ OHS SGQRR I RVTT 1 ARN\ WADAQTQI QN I AAS FD 

QEAAAILMARLAIYRAETEEGPDVLRWLDRCLIRL.CQKFGEYHK 

DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 

FMRQDLTQSL3 MI QPI LYAYSFSGPPEPVLLDS SSI LADR I LLM 

DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 

I LHSRFPMPRY I DTEHGGSQARFLLSKVNPSOTHNNMYAWGOES 

GAP I LTDDVSLQVFMDHLKKLAVSSAA 


6104 


124 


732 


KVSEYI ILSKDKI LFHAIiAMLVLWSPWSAARGVLRNYWERLLR 
KLPQSRPGFPSPPWGPAIAVQ\AQPCLQSQQK1 P VEVKRI /RSL 
LDSI FWJ/JAAPKNRRTIEVNRCRRRNPQKLI KVKNNI DVCPECGH 
LKQKHVLCAYCYEKVCKETAEIRRQIGKQEGGPFKAPTIETWL 
Y TGETP SEQDQG KR 1 1 ERD R KR PS WFTQN 


6105 


3 


989 


PLHGACTSLVLQRFCHRRPRPCAPARPEPMRRPAAVPLLLLLCF 
GSQRAKAATACGR PRMLNRMVGGQDTQEGEWP WQVS I ORNGSHF 
CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWV7GWGSPSEEDLLPEPRI LQKLAVPI IDT\ PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGOSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK 
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRIiDLKRGCARLLTSIESRGRPAAS 
AGliRRDRCALRRMPLRRAPLARATRRRAGSPRRCAPRPRACPC-G 
WSRARHCPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCOV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
IPCKETCEK\'DCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\ VDCTNNAYCVTCNR I CPEPAS SEQYLCGNDGVTYS \ SAC 
HLRKATCL LG RS I GL A YEG KCI KAKS CED I QCTGG K KCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLE VKHSGSCN S I S EDTEEEEEDEDQDYS FPISSILEW 


6107 


623 


168 


SRCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDEDKKGYLSRE 
DFKTAWMLFG YXPSKI EVDS VMS S I NPNTSGI LLEGFLNI VRK 
KKEAQRYRKEVRH I FTAFDT Y YRG FLTLED FKKAFRQ VAP KLPE 
RT VLEV FR E V \ DR DS \ DGH VSF 


6108 


3 


1346 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW 
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 1 
YYFYDLbVYWIGI FCLAS ATGL YS CliAPCVRRLP \ SAS AG ES A i 
LLAPTI PNNSLP Y FHKRPQARMLLLALFCVAVS WWG VFRNEDQ 
WAWVLQDALG3AFCLYML.KTIRLPTFKACTLLLIiVLFLYDIFFV 
FITPFbTKSGSS I MVEVATGPSDSATREKLPMVLKVPRLNSSPL 
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
AYGVGLLVTFVALALMORGOPALLYLVPCTLVTSCAVALWRREb 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATS PWPAEQS PXSRTSEEMGAGAPMREPGS PAES EGRDQAQPS 
PVTOPGASA 


6109 




1381 


CRSRAGAASGGAILEGTKLRRORVDTNKPLDPLVPSALRAAMLY 
LEDYLEM1ECLPMDLRDRFTEMREMPLQV0NAMDQLEQRVSEFF 
MNAKKNKPEWREEQMAS1 KKDY YKALEDA0E KVQLANQ 1 YDLVD 
RHLRKLDQELAKFKMELEADNAGITE I LERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHIPEKKFKSEALljSTLTSDA 
S KENTLGCRNNN S TAS SNN A YN VNSSQPLGS YN I GSLS S GTG AG 
GI \TMAAAOAVOATAQMKEGRRTSSLKAS YEAFKNNDFQLGKEF 
SMARETVGY SS SS ALMTTLTONASSSAADSRSGRKS KNNKKSSS 
QQSSSSSSSSSLSS GSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNEPRY CICNQVSYGEMVGCDTQDCPI EWFKYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


11 


2464 


ACPSAATMSD0DHSMDEMTAWKIEKGVGGNNGGNGNG3GAFSQ 
ARSSSTGSSSSTGGGGQESQPSPLAIJrAATCSRI ES PNENSNNS 
OGPSQSGGTGELDLTATQLSOGANGWQI I SSSSGATPTSKEQSG 
S STNGSNG S ESS KNRTVSGGQ YWAAAPNLQNQQ VLTGLPGVMP 
N I QY0V1 PQFQTVDGQQLQFAATGAQVQQDGSGQ IQ 1 1 PGANQQ 
I ITNRGSGGNI IAAMPNLLQQAVPLQGLANNVLSGQTQ YVTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTI SSASLVSSOASSSSFFTNANSYSTTTTTSNMG IMNFTTSG 
SSGTNSOGOTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGE 
Q\NOQTQAAPKSI.SRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 
ISQETLQNLQLQAVPNSGPI I IRTPTVGPNGQVSWQTLQLQKLQ 
VQNPGAQTI TLAPMQGVSLGQTSSSNTTLTPIASAAS I PAGTVT 
VNAAQLSSMPGLQTINLSALGTSGigVHPIQGLPLAIANAPGDH 
G AQLGLHGAGGDG 1 HDDTAGGE EGENS PDAQPQAG RRTRREACT 
CPYCKDSEGRGSGDPGKKKOHICHICGCGKVYGKTSHbRAHLRVI 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHliSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT 
ATPSALI TTNMVAMEA I CPEG I ARLANSGINVXEGGQFCS PINT 
SANGF 
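
The one-letter notation used for the amino acid segments in this table (twenty amino acids, X for an unknown residue, * for a stop codon, and / and \ for a possible nucleotide deletion or insertion) can be checked mechanically. The following is a minimal sketch only, not part of the original listing; the names `AMINO_ACIDS`, `SPECIAL` and `classify_segment` are hypothetical and are introduced here purely for illustration.

```python
# One-letter codes exactly as given in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
SPECIAL = {
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def classify_segment(segment):
    """Hypothetical helper: strip whitespace from a segment and report
    any characters that the legend does not define."""
    compact = "".join(segment.split())
    allowed = set(AMINO_ACIDS) | set(SPECIAL)
    unexpected = sorted({ch for ch in compact if ch not in allowed})
    return compact, unexpected
```

For example, `classify_segment("MS S K KS FKS KR")` returns `("MSSKKSFKSKR", [])`. A non-empty second element flags characters, such as digits or lowercase letters, that fall outside the legend and may merit checking against the original listing.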


6111 


1637 


797 


R\TORVRGAMAPWGKRLAGVRGVLLDISGVLYDSGAGGGTAIAG 
S \ r EAV AR LKR S RLKVRFCTNESQKS RAELVGQLQRLG FD I S EQE 
VTAPAPAACQ I LKERGLRPYLLI HDGV\ASEFDQI DTS /STPNC 
W I ADAGE SFS YQNKNNAFQVLME LEKPVL 1 S LGKGR YY KETSG 
LMLD VGPYMKALEYACGI KAEVGG KPS PEFFKS ALQA IGVEAHQ 
AVMIGDDIVGDVGGAQRCGMRALQVRT3KFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6112 


77 


196 


MS S K KS FKS KR FLAKKQKPNR PI LOW I WLKTGNK 1 RHNWK 


6113 


1779 


567 


VJEGRSWAACGVNLQGAWGERSGVRASEAESPGKRADVSWWSRQL 
ETMVDHLANTE I NSQRI AA VESCFG ASGQPLALPGRVLLGEG VL 
TKECRXKAKPR I FFLFNDI LVYGS I VLNKRKYRSQHI I PLEEVT 
LELLPETLQAKNRWMIKTAKKSFWSAASATERQEWISHIEECV 
RRQLRATGRPA\STEHAAPWI PDKATD1 CMRCTQTRFSALTRRH 
HCRKCRVWCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAE EQGAGVPRAASHLARP I CGR PVEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 
PGLR D P I P WKQVQRWGVALSGLPV PFCW TLC P Y G FTAGNAF P FR 
KPQKTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPV?ACWPSPPP\PAEQVQC 
GHLP PHADRRALRLP VAAPARG PGPGHPAGPAG PRPARTPPAS P 
HGPGR P TVP AP P C PLLAATEPT PS R PHQRWTRE DRMLGRGSQ VT 
GRPQWLRGLVLFSL 


6115 


324 


71 


DVCC-RVCAHPHLYTH IHMHI CAHAC\ J HTHAQLC/ ITASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


TGVKP PGRWHAA/ 1 SSSGPVFEGARA\ LQTVKKEEEDESYTPVQ 
AARPOTLNRPGQELFRQIiFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQI LELLVLEQFLS I LPGELRVWVQLHN PESGEE\L 
WPCWR$CRGTLMGHPGGTRALP\EPRCALDGYRS\LRSAQ1WSL 
AS PLRS SS ALGDHLE PPYEI EARDFLAGOSDTPAAQMPALFPRE 
GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 


VGVPSPAPPCSWEVGPGGGWTPGILKEGQGGRRTPLLLIjATRTR 
GLLSLFPPAAMHPAAFPbPVWAAVLWGAAPTRGLIRATSDHNA 
S MD FADLPALFGATLS QEGLQGFLVEAH PDNACS P I AP ? PPAPV 
NGS V F I ALLRRFDCN FDLKVLNAQKAG Y G AAVVHNVNSNELLN M 
VWNSEEIQOOIWI PS VFIGERSSEYLRALFVYEKGARVLLVPDN 
TFPLGYYLlPFTGIVGLLVLAMGAVMIARCIQHRKRLQRNRLTK 
\EQLKQI \PTHDYQKGDQYDVCAI CLDEYEDGDKLRVLPCAHAY 
HSRCTVDPWIjTOTRKTCPI LJvOi'VHRbFvjlJtlJWc.t't. i v^y t.h.UlJis 
GEPRDKPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118 


1044 


247 


STISCRACTSGATPGAQSHRSARGHAAGGKETAALGMERGKVKK 
KEKE KETQKE K I GE KGREEKVKRKEVEQKl KOE KQE KQERR KG K 
EKEEKRTK0GKE1T3KEKEOFKGOEEXGEKKDSTLTRTPLEPLEK 
N KQ I bVLGLDG AGKTS VLHSLASNRVQH S VAPTQGFHAVC I NTE 
DS OWE FLE I GG S KP FRS YWEMYLSN / ADS LARS FS VG FKQDS QP 
ITWKAKKYLHQLIAANPVLPLWFANKODLEAAYHITDIHEALA 
II 


6119 


1217 


462 


DPRFVTENTTKAPAOERTTQPRSSREGTLRSTMEYLSALNPSDL 
LRS VS N I S SEFGRR V WTS AP PPQR P FR VCDHKRT I R KGLTAATR 
QELLAKALETLLLNGVLTLVLEEDGTAVDSEDFFQLLEDDTCLM 
VL0SG0SWSP7RSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
DL FG S LNVKATFYGLYS MS CDFQGL\G P KKVLRELLRWTS TLLQ 
GLGHMLLG I SSTLRHAVEGAEQWQQKGRLHSY 


6120 


785 


179 


LERAGGGGLSSRALVGSGACLSLVARANGKGLPRGRKEFVEAVR 
V RYVAFR YRTPRAVCLR LWS CRR E VI MSGRGKQGGKVRAKAKSR 
SSRAGLOFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAE 
1 LELAGNAARDNKKTRI I PRHLQLAI RNDEELNKLLGKVTI AQG 
G \VLPN I QAVLLPKKTE SOKDEG ANDP 


6121 


1612 


107 


FVRAQARGSRQPVRRPLLGAGSRLRCRSCGRMEPLKVEKFATAN 
RGNGLRAVTPLRPGELLFRSDPLAYTVCKGSRGWCDRCLLGKE 
KLMRCSOCRVAKYCSAKCOKKAWPDHKRECKCLKSCKPRYPPDS 
VRLLGRWFKLMDGAPSESEKLYSFYDLESNINKLTEDKKEGLR 
0LVMTFQHFMREE1QDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPS I SLLNHSCDPNCS I VFNGPHL.LLRAVRDIEVGE 
ELTICYLDMLMTSEERRKOL.RDQYCFECD\CFRCQTQDKDADML 
TGDEOVWKEVQESLKK 2 EELKAHWKWEQVLAMCOAI ISSNSERL 
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
SHPVRGVQVMKVGKLOLHOGMFPQAMKNLRLAFDIMRVTHGREH 
SLIEDLILLLE/AMRR0H0S1LRERS0REIRRVSLLNALLRSHT 
hC FVS CVNLS YWKFCS V FV 


6122 


2 


2324 


RFRKMADGGAASQDES S AAAAAAADS RMNN P S ETS KPSMESGDG 
NTGTQTNGLDFQKQPV F VGGA1 STAQAQAFLGHLHQVQLAGTSL 
QAAAOSLNVQSKSNEESGDSOOPSOPSQOPSVQAAIPQTQLMIjA 
GGQI TGLTLTPAOQQLLLQOAOAOAOLLAAAVQQHSAS QQHSAA 
GATISASAATPMTQIPLSQP1Q1AQDLQCLQQLQQQNLNLQ0FV 
LVHP TTNLQ PA \ QF 1 1 S QT PQGQQGLLQAN QNLiiTQLPRQS QAN 
LLQSQPRI\TLTSQPATPTCTIAATPIQTLPQSQSTFKRIDTPS 
LEEP \ S DLEELEQFAKT FKORR1 KLGFT\OGDAGLAMVKLYGND 
FSPTTIFRFEALNLSFK!n : MCKLKPLIjEKWLNDAENLSSDSSLSS 
PS ALNS PG1 EGLS RRRKKRTS I EA\N I RVALEKSFLEN\OKPTS 
EE1TM X ADQLNMEKGV 1 RVW FCNRROKEKR I NPPSSGG\TSSSP 
I KAI FPSPTSLVATTPSLVTSSAATTl^TVSPVLPLTSAAVTNLS 
VTGTSDTTSKNTATVI STAP PASS AVTSPSLSPSPSASASTSEA 
SSASETSTTOTTSTPJjSSPIiGTSOVMVTASGLQTA/AQLIjPFXG 
AAQLPANAS1AAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALS PALMSNSTLATIOALASGGSLP I TSLDATGNLVFANAGGA 
PN 1 VTAPLFLNPQNLSLLrTS N P VS LVS AAAASAGNSAPVAS LHA 
TSTSAESIQNSLFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HLLHRWFGTDMQMINFTTGEF0bTEACPYU5THSEESRFGlLHL 
HUJPLEMKRVGWFTPAE'VGKVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQI LS IT 
KN FKVEN I GPLP ITVSSLK INGYNCQGYGFEV^DCHQFSLDPNT 
SRDI S I VFTPDFTSSWV1 RDLSLVTAADLEFRFTI*NVTLPHHLL 
PLCADWPGPSWEESFWRLTVFFVSLSLbGVII,IAFQQAQYIlJ4 
EFMXTRQR QNASSSSOCNNG PMDVI S PHS YKSNCKNFLDT YGPS 
DKGRGKNCLPVNTPQSRIONAAKRSPATYGHSOKKHKCSVYYSK 
HKTSTAAAS STSTTTEEKCTS PLGSSLPAAKED1 CTDAMRENV7I 
SLR YASG I NVNLOXNLTLP K^JbLN KEENTLKNTI VFSNPSSECS 
M KEGI QTCM FP KETDI KTS ENTAE FKERELC PLKTS KKLPENHL 
P RN S PQYHQPDL PE I S R KM^GNN(X2VP VKNE VDKCENLKKVDTK 
PSSEKKIH KTS REDMFSE KOD I P FVEQEDPYRKKKLQEKREGNL 
QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
S D INVRS WCIQESTREVCKADAE I ASSLPAAQREAEGY YQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
^AOHFLPAGDSVSQNDFPSEAPISLNLSHNICNPMTGNSLPQY 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 
NGVPCVI QESAPVHNSFI DWSATCEGQFSSAYCPLELNDYNAFP 
EENMNYANGFPCPADV07DFIDHNS0STWNTPP\NMPAS\WGNA 
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 
PTTEHSD/THMENQA\WCKEVYPGF\NPFRAYMNLDIVJTTT\A 
NRNAN FPLSRDSS YCGNV 


6124 


1S73 


236 


S DEALRLAGERGMGRVOLFE I S LSHGRWYS PGE PLAGTVRVRL 
GAP L P FRAI RVTCI GS CG VSN KAN DTAWVVEEGY FNSSLSLADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIKTPRr 
SKDHKCSLVFYILSPLNLNS I PDIEQPNVASATKKFSYKLVKTG 
SWLTAS TDLRGY WGQALQLHADVENQSG KDTS P WASLLQKV 
SYKAKRWIHDVRTIAEVEGAGVJCAWPRAOWHEOILVPALPOSAL 
PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 
RPGAASWGPTPGG\PSAPPQEEAEAEAAAGGPHFLDPVFLSTKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGvE 
PSLTPES 


6125 


1 


904 


KTCPKLTCAFTVSVPDSCCRVCRGDGEIiSWEHSDGDIFRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
IVOIVINNKHKHGQVCVSNGKTYSHGESWHPNLRAFGIVECVLC 
TCNVTKQECKKI HCPNRYPCKYPQKIDGKCCKVCPG / KKAKEEL 

TjfinQ PH"NF VfiV'ET'PFPTMD'VTVF QVFMTrTVP'PTDK'T &T .PTPD OI3AV 
trv?\iZ> r L>i\ jvV* i r UwfiCi 4 vxv v Itivr I iCjLAjH< i 1 tlt<±Jn±jL, 1 L>Kir Jry V 

EVHVMTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389 


RLLS E AF CPRSRRRFQMNPEWGOAFVHVAVAGGLCAVAV FTG I F 
DS VS VQVGYEHYAEAPVAGLPAFLAMPKNSLVNMA YTLLGLS WL 
KRGGAMGLGPRYLKDVFAAMALLYGPVQWLRLWTQWRRAAVLDO 
WLTLP I FAW P VAWCLYLDRGWRP \ WLFLSLECVSLAS YGLALLH 
ryu r c v/-viJV7>vn v v t* ±\ v \y\jt±±ji\. x \niui x vj / i ronx I ju/ujovjuo 
CLGFWLKLCDHQLARWRLFCCLTGHFWSKVCDVLQFHFAFLF1. 
THFNTHPRFHPSGGKTR 


6127 


1335 


463 


VLPRRCLVFVVNTMDSSRSPTU5RLDAAGFWQVWQRFDADEKGY 
lEEKELDAFFLHMI^KI^TDDTVMKANLHKVKQQFMTTQDASKD 
GRIRMKELAGMFLSEDENFLLLFRRENPLDSSVEFMQIWRKYDA 
DSSGFJ SAAELRNFLRDJ^FLHHKXAISEAKLEE YTGTMMKI FDR 
NKIKSRLDLNDLARIIiALQENFLIjOFKMDACSTEKRKGDFEKIFA 
YYDV S KTG ALEG P \ E VDG F V KD MMEL VQPS I SG VDLDKFRE I LL 
RHCDVNKDGKlQKSELALCuGLKINP 


6128 


2511 


843 


TCRMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 
SPGSLDGRAWEDAQKP0SAV7CGGRKTRVYATSSRRAPPSEGTRR 
GGAARPEKTAEEGPPAAPGSLRHSGPIpGPHACPTALPEPQVTSA 
MSS Q WG I EP JjY 1 KAEPAS P DS PKGS S ETETEP PVAXAPG \ PAP 

TRCLPGHKEEEDGEGAGPGEOGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVAS CEACKAFFKRTI QGS IEYS CPASNECE ITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGROKYKRRPEVDPLPFPGP 
FPAGPIAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 
GHLP AVATLCDLFDRE 1 WT I SWAKSI PGFSSLSLSDQMS VLQS 
VWME VLVXIGVAORS LTLQDE LAFAEYLVLDEEGARPAGLGELG\ 
AALLQL VRR LQALRLEREKY VI .LKAIALANSDSVHI EDEPRLWS 
S CEKLLH E ALLE Y E AGRAG PGGG AERRRAGRLLLTLPLLRQTAG 
KVLAHF YGVKLEGKVPMHKL FLEMLEAMMD 


6129 


1764 


771 


ARFARS/^HEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKTKCMM 
KSSDCVI KHAG VY S TGLAMVG A 1 CDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQVLEAETFKCVS CNRLGQHS CLRCKACFCDDHTRSKVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
Y W KNLSS DX YGDTS YHDEEEDE Y EAEDDE EEEDEGR KDS DTES S 
DLFTNLNLGRTYASGYAHYEEOEN 


6130 


3 


577 


GRGGTMRE Y KVWLGSG \ GVG K S ALTV \ 0 FVTCTF1 EK YDPT I E 
DFYRKEIEV\DSSPSVAGISWTQQGTEQF\ASMRDLYIKKGQGC 
ILVYSLVKOQSFQ\DIKFMRDQIIRVKVSEKVPVl\LVGN\SVD 
LES ERE VS S S EGRALAEE WGC P FMETS AK S KTM VDE L F AE I VRQ 
MNYAAQPDKDDPCCSACNI Q 


6131 


3 


1811 


SSPREKTSDSSKRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGS PRHLPSC5 PALLLIiVLGGCLGVF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRASILTGKYPHNHHWNNTLEGNCSSKSWQK3 
QEPNTFPAILRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW 
SYWYALEKl^SKYYNYTLSIKGKARKHGENYSVDYLTDVLANVSL 
DFLDYKSNFEPFFMMTATP\APHSPWTAAPQYQKAFQNVFAPRK 
KNFNIHG'J'NKHWLIROAKTPMTNSSIOFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLY 
EFDIKVPLLVRGPGIKPNQTSKMLVANIDLGPTILD1AGYDLNK 
TQMDGMSLLP I LRGASNLTWRSDVbVEYOGEGRNVTDPTCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALVJNLQYCEFDDOEVFV 
EVYNLTADPDOITNIAKTIDPELLGKMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKKLL 


6132 


96 


1251 


AAGLLPPGLVPEDPRRTRf>3LLPFGIQGPPFALSRPLFSCVESGW 
AWEAMEPEFLYDLLOLPKGVEPPAEEELSKGGKKKYLPPTSRKD 
PKFEELQKPA\VLMEW1NATLLPEHIWRSLEEDMFDGLILHKL 
FQRLAALKIjEAE D I ALTATSQKH KLTWLE AVNRS\ CS WR SG R F 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVIT1 
ESTKSGLKS E KLVEQLTE YSTDKDEP PKDVFDELFKLAPE KVNA 
VKEAI VNFVNQKLDR LGLS VQNLDTQFADGVI LLLLIGQLEGFF 
LHLKEF YLTPNS PAEMLKNVTLALELL/ 1 GRG PAQLPC/LALK/ 
TIVNKDAKSTLRVLYGLFCKHTQKAHRDRTFHGAPN 


6133 


2 


4256 


FVKGSMADTDbFMECEEEELEPWQKISDVIEDSWEDYKSVDKT 
TTVS VS QQPVSAP VP I AAHAS VAGHLS TSTT VS S SGAONS DSTK 
KTLVTLIANNNAGNPLVQOGGQPLILTONPAPGLGTMVTQPVLR 
PVQVMQNANHVTSSPVASOPIFITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVR P I TLVF APGTQFVK PTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFT1VIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATCPTSLGOLAVQS PGQSNQTTNPK1APS FPS PPAVS IAS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSS 1 PVFDLQDGGRKI CPRCNAQFRVTEAL 
RGHMCYCCPEMVEYOKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTN F PK VATS FRCPHCTKRLKNN I R FMNHMKKHVE 
LDQONGEVDGHTI CQHCY RQFSTPFQLQCHLENVHS PYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCOVCQYRSSLYSEVD 
VHFRMIHEDTRHLLC?YCLKVFKNGNAFQQHYMRKQKR\NVYH\ 
CNKCR VQ FLFAKD K I EHKLQHH KTFR KP KQLEGLKPG T KVTI RA 
SRGQPRTVPVSSNDTPPSAXQEAAPLTSSMDPLPVFLYPPVORS 
IQKRAWKMSVKGRQTCLECSFEIPDFP1JHFPTYVHCSLCRYST 
CCSRAYANHMI NNHVPRKS PKYLALFKNSVSGIKIiACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
*TVKTtfMYPPP9FPTNKAATVK QAftATPAFPFFT.T.TPT.APAT.PQPA ' 
STATPP PTPTH PQALALPPLATEGAECliNVDDQDEGS PVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRWLFALCCOTEOAAEHFRNPQ 
RRIRRWLRRF0ASOGENLEGKYLSFEAEEKLAEWVLTQREOOLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFI DFVQRQIHNQDLPLSMI VAIDE I SLFL 
DTE\nL,SSDDRKJENALQTVGTGEPWCDVVLAIXJUDGTVLPTLVFY 
RGQMDQPANMPDS I LLE AKESG YS DDEI MELWSTRVKQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIOPL 
WO 01/53312 



PCT/US00/34263 



DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV 
LGVI GDCPELVQRS FL VAS VLPG PDGNINS PTR N ADMQ E ELI AS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6134 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDV1EDSVVEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHIjSTSTTVSSSGAONSDSTK 
KTLVTLIANNNAGNPLVQC^GQPLILTQNPAPGLGTMVTQPVLR 
P VQVMQNANHVTSS PVASOPI F I TTQGFPVRNVRP VQKAMNQVG 
I VLNVOOGQTVRPI TLVPAPGTOFV KPTVGVPQ VFSQMTPVRPG 
S TM P VR PTTNTFTTVI P ATLTI RS TVPQSQSQQ TKS TPS TS TTP 
TATQPTSLGQIAVOSPGQSNOTTNPKLAPSFPS PPAVS I AS FVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSFGPWVSNNSSAH\ 
GSQRTSGPESSMKVTSS I P VFDLQDGGRKI CPR CN AQFR VTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKP PS PEKTAPVAS 
/ TH P S S TP I PALS P PY / TKVPE PNEN VGDAVQT KL I MLVDDF Y Y 
GRIX3GKVAOLTNFPKVATSFRCPHCTKRLKNNIRFMNHM}ai3JVE 
LDQQNG E VDGHTI CQH CY RQ FS T P FQLQCHLENVH S P YE S TTKC 
KI CE W A FE S E PLFLQHMKDTH K PGEMP YVCQVCQY RS S LY S E VD 
VHFRM I HEDTRHLLCPYCLKVFKNGNAFQQHYMRHOKR\ NVYH\ 
CNKCRVQFLFAKDKIEKKLQHHXTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVC/RS 
IQKRAVRKNSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST 
OCSRAYANHKINNHVPRKS PKYLALFKNSVSGI KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSI1.PRGLTWIAHSRHG0TRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA 
STATPPPTPTHPOALALPPLATEGAECLNVDDODEGSPVTOEPE 
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFOASQGENLEGKYLSFEAEEKLAEWVLTQREQOLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPKARRA 
VAHTLP KDVAENAGLFI DFVQRQ I HNQDLPLSM IVAI DEISLFL 
DTEVLS SDDRKENALOTVGTGEPWCDWLAI LADGTVLPTLVFY 
RGQMDQPANMPDS I LLEAKESG Y SDDE I MELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKJOPL 
DVC I KR TV KNFLHK KW KEQARE MADTACDS DVLLQL VL VWLGE V 
LGVI GDCPELVQRS FLVASVLPGPDGN INS PTRNADMQEELI AS 
LEEQLKLSGEHSESSTPUPRSSPEETIEPESIMQL FEGESETES 
FYGFEEADLDLMEI 


6135 


2 


4256 


F VHGSMADTDLFME CEEEE LE P WQ KI SDV I EDS W E D YNS VDKT 
TTVSVSOOPVSAPVPIAAHASVAGHLSTSTTVSSSGAONSDSTK 
KTLVTLI ANNNAGNPLVQQGGQPL I LTQNPAPGLGTMVTQP VLR 
PVQVMONANHVTSS PVASOP IFI TTQG FPVRNVR P VQNAMNQVG 
I VLNVQOGQTVRP 1 TLVPAPGTQFVKPTVGVPQVFSQMT PVRPG 
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQS PGQSNQTTKPKIiAPS FPSPPAVS I ASFVT 
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKICPRCNAQFR VTEAL 
RGHMCYCCPEMVEYOKKGKS LDS EPS VPSAAKP PSPEKTAPVAS 
/THPSS TP I PALSPPY/ TKVPE PNENVGDAVQTKL1 MLVDDF YY 
GRDGGKVAOLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGEVDGHTIC<)HCYRQFSTPFQLQCHLENVHSPYESTTKC 
KI CE WAFE S E PLFLOHM KDTKKPG EMP YVCQVCQYR S S LY SE VD 
VHFRMIHEDTRHLLCPYCLKVFrJJGNAFQQKYWRHOKR\NVYH\ 
CNKCRVOFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALCEAAPLTSSMDPLPVFLYPPVQRS 
lOKRAVR^SVMGRQTCLSCSFEIPDFPKHFPTYVHCSLCRYST 
CCSRAYANHMI NNHVPRKS PKYLALFKNSVSGI KLACTS CTFVT 
S VGDAMAKHLVFKPS H RSSSILPRGLTWIAHS RHGQTRDR VHDR 
NVKNKYP F PSFFTNKAATVKSAGATPAEPEELLTPLAPALPS PA 
STATP P PT FTHPOALIALPPIIATEGAECLNVDDQDEGS PVTOE PE 
IIASGGGGSGGVGKKEOLSVKKLRWLFALCCNTEOAAEHFRHPQ 
RRIRRWLKRFQASOGENLEGKYLSFEAEEKLAEWVLTQREQQLP 
VNEETLFC KATKI GR S LEGGFK I S YEWAVR FMLRK H LTPHAR RA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISL.FI* 
DTEVLSSDDRKE1CAL0TVGTGEPWCDVVIAILADGTVLPTLVFT 
RGQMDQPA>:K?DS I LLEAKESGYSDDEI MELWSTRVWQKHTACQ 

RSKGMLWiDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKlOPL 
DVCI KRTV KNFLHKKW KEQAREMADTACDSDVLLQLVLVW LGEV 
LGV1GDCPFLVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLME1 


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS 
SL FRG EHK £ R GGTX3 R LA S LFSSLE PQ 1 Q P V YVP VPK\ ES ALA S A 
DLEEEIHOKQGQKRKI^SQPGVKVADRKILDDTEDTWSQRXKIQ 
INQEEERLF1NERTVFVGNLPVTCN KKKLKS FFKEYGC 1 ES VRFR 
SLI FAEGTLS KKLiAA 1 KRK IHPDQKN I NAY WFKEESAATOALK 
RNG AQ I ADG FR I R VD LAS ETSSRD KRS VFVGNLPYKVEES A I EK 
HFLDCGS I KAVRI VRDKMTGIGKG FG YVLFBNTDS VHLALKLNN 
SELMGRKLRVMRSVNKEKFKQQNSNPRLKKVSKPKQGLNFTSKT 
AEGH P KSLF 1 GEKAVLLKTKKKGQKXSGRPKKQRKQK 


6137


141 


265b 


RALR KRRCG PGRR GAlLG SGPGPQR RPGR VPEERPA PFKER X H PG 
MWNWLI VAI»1CLA\LLC-LPGKAQELQGHVS\1 1 LAGECLGDLAKK 
YLWQG\LFCLYLDEAGRGHSFSFHGAALTAPKQGQELN1AKALES 

lscp kdmafshcaeh kdqflqlsqyrqlictaedyqalnkd j eaq 
lchaglreaggifyfsvppfayediarninsscrpgpgawlrw 
lekpfghdhfsaqqlatelgtffqeeemyrvdhylgkqavaoil 
pfrdqnrkaldglwnrhuverveiimketvdaegrtsfyeeygv 
irdvlqnhltevltlvamelpknvssaeavlrhklqvfqalrgl 
qrgsawgoyqs yseqvrrelqkpds fhsltptfagvlvh I DNL 
rwegvpfiuvsgkaldervgyarilfknqaccvqsekhkaaaqs 
qclfrqlvfnlghgdlgspavlvsrnlfrpslpsswkemegppg 
LRLFGS plsc yyays pvrerjdahsvllshi fhgrknffittenl 
lasknfwtplleslahkaprlypggaengrlldfefssgrlffs 
qqqpeqlvpgpgpgpmpsdfqvlrakyresslvsawseeli s kl 
andi eatavravrr fgqfjilalsggss pvalfqqlatahyg fpw 
ahthlwlvdercvplsdp esnfqglqahllqhvri py ynih \ AM 
pvhlqqrlcaeedqgahiyareisalganssfdlvllgmgajxsh 
taslfpqsptgldgeqlwlttspsqphrrmslslplinrakkv 
avlvkgrmkreittlvsrvghepkkwpisgvlphsgqlvwymdy 
daflg 


6138 


4507 


934 


efskltdrwcnavqgvrqrkgdvdglvrqwqdfttsvenlfrfl 
tdtskllsavkgoerfslyqtrslihelknxeihfqrrrttcal 
tleagekllbttdlktkes vgrr i sqlqdswkdmep0laem1 kq 
FQSTV etwdqcekk I KELKS rlqvlkaqsedplpelh edlhn e k 
eli keleqs las wtqnlkelqtm kadltrhvlvedvmvlkeq i e 
hlhrowedlclrvairkqeiedrlntwwfneknkelcawlvqm 
enkvlqtad2 sieemieklqkdcmeeinlfsenklqlkqmgbql 
ikasnksraaeiddklnkindrwqhlfdvigsrvkklketfafi 
qqldknmsklrtwlar i eselskpwydvcddqeiqkrlaeood 
lqrd i eqhs agvesvfni cdvllhdsdacanetecds 1 qqttrs 
ldrrkrnicamsmerrmk1eetvjrlwqkflddysrfedwlksab 
rtaacpns s evlytsakeelkrfeafqrqi herltqlelin kqy 
rrlarenrtdtasrlkqmvhegnqrwdnlqrrvtavlrrlrhft 








NOREEFEGTRESILVWbTEMDLQLTNVEHFSESDADDKJ^RQLNG 
FQQEITbNTNKlDQbIVFGEQbIQKSEP\bDAVbIEDELEEbHR 
YCQEVFGRVSRFHRRbTSCTPGbEDEKEASENETDMEDPREIQT 
DSWRKRGESEEPSSPQSbCHbVAPGKERSGCETPVSVDS\IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKS 3 SDGHSVmVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLbPPGTDGG 
K£GPRVbNGNPQQEDGGbAG3TEQQSGAFDRWEM3QAQEb\HNK 
bKI KQNbQQbNSD I SAlTTWbKKTE AELEMLKMAKPPSDI0E3 E 
bRVKRbOEIbKAFDTYKALWSVNVSSKEFbCTESPESTELQSR 
LRObSLLWEAAOGAVDSWRGGLRQSbMQCODFHOLSQNLbbWLA 
SAKimROIOUlVTDPKADPRALbECRRELMObEKEbVERQPQVDM 
bOE I SNSLLI KGHGEDCIEAEEKVHVI \EKKLKQLREQVSQDLM 
AbOGTONPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSFLSRWRAALPbObLbbbbbbbACbL 
PSSEEDYSCTQAKNF\ARSFYPMbRYTNGPPPT 


6139


52 


1131 


LGDWVWSRrCGVLETPTSVLRRARARGPCPTDSKWALPRbREGE 
TERRPWEASSWKTL/bAGWIGGAASVIVGHPbDTVKTRbQAGVG 
yGNTLSClRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN 
TQR FbSOHRGGEPEAS PPRTLSDLLbASMVAG WS VGLGGPVDb 
IKIRLQMQ7PPVSGRQPRFEVQGSGSCG\EPAY0GPVHCITTIV 
RNEGLAGLYRGASAMbbRDVPGYCbYFIPYVFLSEWlTPEACTG 
PS PCAVWLAGGMAGAI S WGTA.TPMDWKS RLOADGVY LWXYKGV 
LDCISOSYQKEGbKVFFRGITVNAVRGFPMSAAMFbGYEbSbQA 
IRGDHAVTSP 


6140 




136 


RPEbEbKRLRSRSWRPLGVPRRCHRRNWKEPVRAOPLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAIPTPRAAASAAbEbPbGPAPV 
SVAPQAEAEARSTPGPAGSRbGPETFRQRFROFRYQDAAGPREA 
FRQbREb /S PRQWLRPDI \RTKEQ\ I VEMLVQEObLAI LPEAAR 
ARRIRRRTDVRITG 


6141 


2 


984 


AOVGPRSRPCKMPbKbRGKKKAKSKETAGbVEGEPTGAGGGSLS 
ASRA PAR Rb VFHAQLAHG5ATGR VEGFSS I Q Eb YAQIAGA FEIS 
PS E I bYCTbNTPKI DMERLbGGQbGbEDFI FAHVKG I EKE VNVY 
KSEDSbGLTI 7DNGVG YAFI KRI KDGGVIDSVKTI CVGDH I ESI 
NGENIVGWRHYDVAKKLKEbKKEEbFTMKblEPKKAFEIELRSK 
AGKSSGEK3GCGRATbRbRSKGPATVEEMPSETKAK\AIEKIDD 
VLE bYMG I RDI DLATTMFEAGKDKVNPDEFAVAbDETbGDFAFP 
DEFVFDVWGVIGDAKRRGb 


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARlGPGVMESKEERAbNN 
b 1 V ENVNQENDE KDEKEQVAN KGE P LALPLNV S E Y C VPRGNRRR 
FRVRQPI bQYRWDIMHRbGEPOARMREENMER J GEEVRQbMEKb 
REKObSHSLRAVSTDPPHHDHHDEFC\bMP 


6143 


2802 


270 


FRMRI FbHCPWNQQlVIWKIWNLbETSbESCKAHbS IQKbLKER\Q 
\QbPVFKHRDSIVSTbKRHRWWAGET\GSGKSTQVPKFLbED 
bbbNE WEASKCN I VCTQPRRI S AVS LANRVCDEbGCENG PGGRN 
SbCG YO I RMESRACESTRXbYCTTG VbbR K LQEDG LLSNVS /HM 
FIVDEV\HER\SVQSDFbbIILKEIbQKRSDLKbIbMSATVDSE 
K FSTY FTH CP3 LR I SGRS YP VE VFH bEDI I E E TGFVLEKDSE YC 
QKFbEEEEEVTINVTSKAGGI KKYQEYI PVQTGAKADbNPFYOK 
YSSRTQHAILYMNPHKINbDblbEbbAYbDKSPOFRNIEGAVbl 
FbPGbAH I QQLYDbLSNDRRFYSER Y KVI AbHS 3 LSTQ DCAAAF 
TbP P PG VR KI VLATNI AETG I T I PDWFVI DTGRTKEK KYHES S 
OMSSbVETFVSKASAbQRQGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPEILRVPbEEbCbHIMKCNbGSPBDPbSKAbDPPObQVISN 
AMNbbRKIGACEbNEPKLTPLGQHbAALPVNVKI GKMbI PGAI F 
GCbDPVATbAAVKTEKSPFTTPIGRKDEADIAKSAIAMADSDHb 
TIYNAYLGWKKARQEGGYKSEITYCRRNFbNRTSbbTbEDVKOE 








LIKLVKAAGFSSSTTSTSWEGNRASQTLSFQEIALLKAVLVAGL 
YDNVGKI I YTKS VDVTEKLAC I VETAQGKAQVH PS S VNRDLQTH 
GWLLYOEKIRYARVYLRETTLITPFPVLLFGGDIEVQHRERLLS 
I DGW3 Y FQAPVKI AVIFKQLRVLI DSVLRKKLENPKMSLENDKI 
LQHTELIKTENN 


6144 


1289 

7 


568 


S G FG S MSGQR VD VK VVMLG KE Y VG KT S LVER YVHDR F LVGP YQN 
VSASGGARHGGRGSGGPV1 CTYGPDLFPhVA \ Tl GAA FVAKVMS 
VGDRTVTLG I W DTAGS ERYEAMSR I YYRGAKAA I VC Y DLTDSSS 
FERAKPWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QD YADN I KAQL FE TS S KTGQSVD £ L FQKVAED Y VS VAAFQ VMTE 
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 
G PMVYAI CYCP LPRLADLEALKVADSKTLLESER ERLFAKMEDT 
DFyGWALDVLS PNLI STSMLGRVKYNIiNSLSHDTATGLI QYALD 
QGVNVTOVFVDTVGMPETYQAR LOOS FPGIEVTVKAKAJDALYPV 
WSAAS I CAKVARDOAVKKW0PVE>CLQDLDTDYG\SGYPNDPQD 
/ 7KA WLKEH VE P VF \ G FP \QF VR F \ S WRTAQT 1 \ LE KEAEDV I R 
EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
h 


6146 


426 


781 


LKKKGKEKAEAOQVEALPGPSLDQWHRSAGEEEDGPVLTDEOKS 
R/Y PGHE AHDQGG \ WDAROS 1 1 RK WDPETGRTR L I KGDG E VLE 
E I VTKERHREINKQATRGDCLAFCMRAGLLP 


6147 


1 


2304 


GTROLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQliMDS 
ETDMVR03 RALDS DMQTLVYENYN K F I S ATDT I RKMKND FRKME 
DEMDRLATNMAVI TDFSAR I SATLQDRHERI TKLAGVHALLRKL 
0 FLF E LPSRLT K C VEIX3A YGQAW Y OGRAQAVLOQYOKLPS FRA 
10DDCQVITARLAOQLRQRFREGC-SGAPEQAECVELLLALGEPA 
E ELCE E FLAHAR GRLE KELRNLEAE LG PS PPAPD VLE FTDHG \ S 
SG F VGG LCQVAAA Y QELFAAQG PAG AE KLAAFARQLG S R YFALV 
ERRl^QEI^GGDNSli-VRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLOGLRAAFLGCLTDVROALAAPRVAGKEGP 
GLAELLANVAS S I !>SH1KAS LAAVKL FTAKEVS FSNKPYFRGEF 
CSOGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLIiS 
RL-CLDYETATJ SYI LTLTDEQFLVC'DQFPVTPVS TLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVOVLPRLAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDM C I WASKGAS S VARAS VR EPQCN KS PRMNTXRAGE CLCPRS 
CS FSAQDYDI FAP I LPVEKQRLRVTQE VRAGLVLVLK I RPQTNS 
CILPLPHSTGSINSDHVPTK 


6148 


305f 


353


VPAVGGTFADGAKGEAEKFH YI YS CDLDI NVQLXJ GSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPLALPVRT 
SYKAFSTRWNWNE WLKLPVKY PDLPRNAQVAL.T I WDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPNCRSOMDQKPTKTPGRT 
SSTLSEDQMSRLAKLTKAHRQGHMVK\T)WLDRLTFREIBMIIJES 
VKRSSNFMYLMGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPDPDM^T F7J7 VFSJfflWN7.PI?CLR SGPSDHDI.FCPYPSPRDO 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNODKALTKILTSVIW 
DLPOC-AKQALALLG KWKPMDVEDS LELLS SHYTN PT VRRYAVAR 
LROADDEDIiLMYLLQLVQALKYENFDD I KNGLEPTKKDSQSSVS 
ENVSNSGINSAE3DSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEODLCTFLI SRASKNSTLAN YLYWYVI VECEDQDTQQRDPK 
THEWLNVMRRFSQALLKGDKSVRVT^SLUlAQC/rFVD 
KAVORESGITOKKKNERLOALLGDNEKKNLSDVELIPLPLEPQVK 
3RGI I PETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDLRQD 
OLI LQ I I S LMDKLLRKENLDLKLTFYKVLATSTKHGFMQFIQSV 








PVAEVLDTEGSIQNFFKKYAPSENGPNGISAEVMDTYVKSCAGY 
CVITYILGVGDRHLDNLLLTKTGKLFHIDrGYILGRDPKPLPPP 
MKLNKEM VEGMGGTOS EO YQEFRKQC YTAFLHLRRY SNLI LNLF 
SLMVDANIPDIALEPDKTVKKVQDKFRLDLSDEEAVHYMOSLID 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1


1413 


rx'dprvrengtanp i kivgktspaskdqrtgkktsvqgqvqkgnd 
esesdfesdppspksseeeeoddeevlogeogdfndddtepenl 
ghrpllmdsedeeeeekhssdsdyeoakakysdkssvyrdrsgs 
g ptqdlnti ^ltsaols s dva vetpkoe fdvfgavpffavraqq 
poqekneknlpqhrfpaagleoeefdvftjcapfskkvnvqecha 
vg peahti pgy pks vdv fgstp fqpfltsts ksesnedlfglv p 
fdeitgsqqqkvkqrsloklssrorrtkodmsksngkrhhgtpt 
stkktlkptyrtperarrhkkvgrrdsossnefltisdskenis 
valtdgkdrgnvlopeesbbdpfgakpfhspd\lswhpp\hogl 
s \di radhn7\vlpgr \ prqnslhgsfhsadvlkmddfgavp/ f 
ltelwositphosoosopv\eldpfgaapfpskq 


6150 


372 


37 


msnikky1 1 dydwkas 2 e i eidhdvmteeklhqinnfwsdseyr 
lnkhgsvlnavlimlao:4alli ai ssdlnaygwcefdwndgng 
oegwppmdgsegiritd: dtsgif 


6151 


1555 


521 


dsnoqsvsgtaastllksfkatiyyogtghvqqfygvtspysqt 

TP P I VQSYAQPSLQY1 QGOOI FTAHPOGVWQPAAAVTTI VAPG 

opqplopsemvvtnnbldlpppsppkpktivlppnwktardpeg 
ki yyyhv i trotqwdp pt w espgddasleke aemdlgtpt yden 
pmk\askkpktaeadtsselakkskevfrkemsqfivqclnpyr 
kpdckvg\ritttedfkklarklthgvmnkelkycknpe\dlec 
kenvkhktkey i kkykok fg avykpkedtefrvtvgpgv7edgws 
gktdsrerkscgpfcstpvstvllkihhpgefnpadvn 


6152 


1366 


648 


NRTWSTPSTWHGVALPF1CSTGPWPVTROITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCKGHGTCVDGHCOCTGHFWRGPGCDELDC 
GFSNCSQHGLCTETGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KC EHH CPCD P KTGN CS VS R V KQ CLQ P PE ATLRAG ELS FFTRTAW 
LALTLALAFLLLISTAAlv'LSLLLSRAERNRRLHGDYAYHPLQEM 
NGE PLAAEKEQPGGAHNP FKD 


6153 


2 


3368 


GR VGARS PGRAYALLLLL 1 CFNVGSGLHLQVLSTRNENKbLPKH 
PHLVRQKRAWITAPVAL.LEGEDLSKKNPIAKIHSDLAEERGLKI . 
TYKYTGKG 1TEPPFG1 FV FN KDTGELNVTS I LDREETP FFLLTG 
YALDARGNNVEKPLELR 1 KVLDI NDNEPVFTQDVFVGSVEELS A 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 
GE I YTTS VTLDREEHS S Y TLTVEARDGNGEVTDKPVKQAQVQI R 
I LDVNDN I PWENKVLEGMVE ENQVNVE VTRIKVFDADEI GSDN 
WLANFTFASGNEGGYFH I ETDAOTNEGI VTLIKEVDYEEMKNLD 
FSVIVANKAAFHKS I RSKYKPTPIPIKVKVKNVKEGIHFKSSVI 
S I YVSESMDRSSKGQ 1 1 GNFQAr DEDTGLPAHAR YVKLEDRDNW 
I SVDSVTSE I KLAKLPDFFSRYVQNGTYTVKI VAI SEDYPRKTI 
TGTVLINVED INDNCPTLI E P VOTICHDAEYVNVTAEDLDGH PN 
SGPFS FS V I DKPPGMAE K~WK I ARQESTS VLLQQSEKKLGRSE I Q 
FLI SDNQG FS CPEKOVLTLTVCEVLHGS \GCREAQHDS YVGLGP 

HATM WITT T T T T T T T t//*» 7\ tf'PCPDTIlP'PT DMT IT T» 

WNNEGAP P E DKWP S FL P VDQG G S L VGRNG VGGMAKE ATMKGS S 
SASIVKGQHEMSEMDGRVJ5EHRSLLSGRATQFTGATGAI\MTTE 
TT I TARATG AS RD VAG AO AAAVALN EE FLKN Y FTDKAAS YTEED 
ENHTAKDCLLVYSOEETE SLNAS I GCCS FI EGELDDRFLDDLGL 
KFKTLAEVCLGQKI DIN KE I EQRQKPATETSMNTASHSLCEQTM 
VN S ENTYS SGSSFPVPKS LQE ANAE KVTQE I VTERS VS SRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSOPQSLIV 
TERVYAPASTLVDOPYAICEGTVVVTERVIQPHGGGSNPLEGTQH 








LODVPyVMVRERHSF^.FSSGVQPTLAMPNIAVGONVTVTERVL. 
APASTLOSSYQIPTENSKTARNTTVSGAGVPGPLPDFGLESSGK 
SNSTITTSSTRVTKHETVQHSYS 


6154 


3660 


214£ 


KKKTKMKNTLQKTVN FG AW P KPT 1 SDKSHLLQM V S KLDLTDAKN 
SDTAHIKS1EJTSILNGLQASESSAEDSEQEDERGA0DMDNNGK 
EESKIDHLTNNRNDL3 S KEEQNSSSLLEENKVHADLVISKPVSK 
S P ERLR KD2 E VXS EDTP YEEDEVTK KRKDVKKDTTDKS S KPQ I K 
RGKRRYCNTEECLKTG5PGKKEEKAKNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSEVAEKRIKLL 
NNSDERLO^SRAKDRKnWSSIOGQWPKKTLKELFSDSDTEAAA 
SFPHPAPEEGVAEESL0TVAEEESCSPSVELEKPPPVNVDSKP1 
EE KTVE VNDRKAEFPS £ GS N FSA* I PLPYLHLNRLHQSL*QKGS 
RQQS S VTVSEPliAPNOE EVRS I KSETDSTIEVDS VAGELQDLQS 
ERE* LA5RF* CQCELEC** * SARTRTS * KSLYRSEKSERCSGRRK 
FIKKAEKKP*SNSGKQOKEGK 


6155 


869 


121 


HLLPELRGKSWITMKYVFYLGVLAGTFFFADSSVQKJEDPAPYLV
YLKSHFNPCVGVLI KPSVA/LAPAHCYLPNLKVMLGNFKSRVRDG 
TE0TINP1Q3 VRYWNYK HSAPQDDLMLIKLAKPAMbNPKVQALN 
P\ PTTNVRPGTVCLLSG LDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSKRNSLCVKFVKVFSRI FGEVAVATVI CKDKIiQGIE 
VGHFMGGDVG I YTNVY K YVSWI ENTAKDK 


6156 


5725 


3984 


GTSTVTKATKKHFS 1 1 LNliliGMLLKKDNQDTRKliLMTWAbEVAV 
VKKKSETYAPLFCLPS FHKFCKGLIiADTLVEDVNI CLOACSSLK 
ALSSSLPDDLbQRCVDVCRVQLVHRGTCIRQAFGKLLKSI PLGV
FLSNNNHTEIOEISLAI.RSHMSKAPSNTFHPQDFSD/VISFILY
GNSHRTGKDKWLERLF Y S CQ3LDKRDQSTI PRNLL KTDAVLWQK
AI WEAAQFTVLSKLRTF L.GRAQDTFQT IEGIIRS LAGHTLW PDQ 
DVSQWTTADKDEGHGNNOLRLVLLLQYLENLEKLMYNAYEGCAK 
ALTSPPKV1 RTFLYTNROTCQDWLTR I RLS IMRVGLLiAGQPAVT 
VRHG FDLLTE M KT T S LS OGNELEVS I MM WEALCE LHCPEA I CG \ 
IAVWSSSIVGKHLLWINSVAOOAEGRFEKASVEYOEHLCAMTGV i 
DCCISSFDKSVLTliASAGCKSASliKHCLNGESRKSVLSKPTDSE 
PEVINYLGNKACECYI ETADWAAVQEWQNAIHDLKKSTSSTSLN 
LKAD FNYIKSLSS FESGKF VECTEQLELLPGENINLIAGG SKEK 
IDtfKKLLRNK 


6157 


946 


325 


MANRGPSYGLSREVQEK j EQKYDADLENKLVDWI ILQCAEDIEH 
PPPGRAHFOKWLMDGT\?bCKLINSLYPPGOEPIPKISESKMAFK 
OMEQISOFIjKAAETYGVRTTDIFQTVDLWEGKDMAAVORTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASGAGMTGYGMPRQIM*DAA5CP 


6158


441 


1482 


LGSLI VLSbHCKVI FSSO SLERAMKEKAVDLVP I LAQNPG1AON 
PILEGKDHNONTGVDP1 1 DHVQDRKTD/ SRSKSPHKKRSKSRER 
RKSRSRSHSRDKRKDTKEKIKEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDREKDKEKDREREREKEHEKDRDKEKEKE 
QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRHSRSSSRE 
RRKRRSRSSSRSPRTSKT1KRKSSRSPSPRSRNKKDKKREKERD 
H1SERRERERSTSMRKSSNDRBGKEKLEXNSTSLKEKEHNKEPD 
S S V SKE V DDKDAP RT E Eft KI QHNGNCQLNEENLSTKTEAV 


6159 


53 


84 


AVIAPI/HISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVWSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
1 1 TGTP I LTFVKDPObE VNFYTGMDEDSDI AFQFRLH FGK PAI M 
NSCVFGIWRYEEKCYYLFFEDGKPFELCIYVRHKEYKVMVNGOR 
I YNFAHR F P P AS VKMLO V FR D I S LTR VL I SD * GRC VR I TAVQE F 
DVSVSCDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP * F* KVADAQPTES E KE I YKQVNWLKDAEG ILEDLQS 
YRGAGHE I R E A 1 QH FADE KLQE KAWGAWPLVG KLK K FY E FS QR 








LEAALRGLLG ALTST P YS PTQHLEREQALAKQFAEI LH FTLRFD 
ELKMTNPAI QNDFS Y YRRTLSRMR INNVPAEGENEVNN E LANRM 
SLFYAEATPMLKTLS DATTKFVSENKNLP I ENTTDCLS TMASVC 
RVMLETPEYRS RFTNEETVS FCbR VMVGVI I LYDKVHP VGAFAK 
TSKI DMKGCI KVLKDQ P PNS VEGLLNALRYTTKHLNDETTSKQ I 
KS MLQ * QLLT LVN KG 


6161 


455


1569 


PVSGSESSLRRAWASILRLMLGPRVAVSILCEDGISH*LLEKH* 
KSHVLEPLSSLALEEQCLALSLDWSTGKTGRAGDQPLKI I SSDS 
TGQLHLLMVN ETRPRLOKVAS WOAHQFEAW I AAFNYWH P E I VYS 
GGDDGLLRGWDTRVPGKFLFTSKRHTMGVCSIQSSPHREKIIAT 
GS YDEHI LLW DTRNM KQ PLADT PVQGG VWR I KWHPFHKH LLLAA 
CMHSGFKILNCOKAMEERQEATVLTSHTLPDSLVYGADKSWLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCHECREDNDGEGH 
AR PQSGM KP LTEGMR KNG TWLQ AT AATTRDCG VN PEEAD S AFS L 
LATCSFYDHALHLWEWEGN 


6162 


1 


586 


RT1HATGRAGASPMHRLIVWRLAEANKQHVRC0KCLEFGHWTYE 
CTG KRKYLHR P S RT AEL K KALK E KEN R LLLQQS I GETNV ER KAK 
KKRSKSVTSSSSSSSDS5ASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE I ELLHS YWTDGLKTLM 


6163 


1081 


785 


R I RS TTEGCAVRLHPTQNTG KAR1 M I LLSVS LGRHWAFT Y KFFL 
TPWFVFFFFFFHRKE*VM0KNPMKSREDEWMEKLNNLHVORAD
MNRL I MNYLVTEG FKE AAE K FRMESG I EP5VDLETLDER 3 K 1 RE 
MILKGQIQEA1ALINSLHPELLDTNRYLYFHL0OQHLIELIROR 
ETEAALEFAOTOLAEQGEESRECLTEMERTLALLAFDS P EES PF 
GDLLHTMQRQ K VW S E VNQ AVLDY ENRESTPKLAKLLKLL-LWAQN 
ELD0KKVKYP KMTDLS KG VI EEPK 


6164 


90 


406 


PCQS PGRS RMRQD KLTG S LRRGGRCLKRQGGGVGT I LSNVLKKR 
SCI SRTAPRXLCTLEPGVDTKLKFTLEPS1/3QNGFQQWY DALKA 
VARLSTGI PKEWRRKVWLTLADHYLHS IAIDWDKTMRFT FNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGGEAEQDRWLKRVLLAYAR 
WNKTVG YCQGFN I LAAL 1 LEVMEGNEGDALKI M I YLI DKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANIKESGG 
G YEP PLTNVFTN5CW?LTLFATCLPNQTVIiKIWDS VFFEGSE 1 1 L 
RVSLAI WAKLGEQ I ECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPOLAELREKYTYNITPFPATVKPTSV£GRHS 
KARDSDEENDPDDEDAVVNAVGCLGPFSGFLAPELQKY0KQ3KE 
PNEEQSbRSNNl AELS PGA1NS CRSEYHAAFNSMMMERMTTDI N 
ALKRQYSRIKKKQQQOVHOVYIRADKGPVTSILPSQVNSSPVIK 
HLLLGKKMKMTNRAAXNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTH I R VHK KNM PRT KSH PG CGDTVG L I DEQNEAS KTNGLG AA 
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAKLGALELNORDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
SKAPQGSNSKTP1 FS PFPSVKPLRKSATARNLGLYGPTERTPTV 
HF POMS RS FS K PGGGN SGP * KM VFS SGTMLS RQLPG Y PQ E Y QRN 
GGERFG 


6165 


90 


406 


PC0SPGRSRMR0DKLTGSLRRGGRCL.KRQGGGVGT3LSNVLKICR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGI PKEWRRKVWLTLADHYLHS 1AIDWDKTMRFTFNERS 
NPDDDSMG1 Q I VKDLH RTGCS S YCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVI.PES 
YF^NNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG 
GYEPPLTNVFTMOWFLTLFATCLPNQTVLKIWnsVFFEGSEIIL 
RVSLAIWAKLGE01 ECCETADEFY STMGRLTQEMLENuLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKFTSVSGRHS 
KARDSDEENDPDDEDAVVNAVGCLGPFSGFLAPELQKYOKOIKE 








PNE EOS LRS NN 1 AEL S PGAI NS CR S E YHAAFNS MMMER MTTD I N 
ALKRQ Y SR I KK KOOQQ VHQVY I RAD KG P VTS I LP SQVN S S P V I N 
HLLLGKKMKMTNXAAKNAVIHIPGKTGGKISPVPYEDLKTKLNS 
PWR TH I R VH KKNM PRTKS HPGCGDT VG L I DEQNE AS KTNG LGAA 
EAFPSGC7ATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA. 
VQAKLG ALELN OR D AAAETELR VH P PCQRHC P EPPSAPEEN KAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPOMSRSFSKPGGGNSGP*KMVFSSGTMLSHOLPGYPQEYORN 
GGERFG 


6166 




1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLOEVKRVLYGKELRK 
LDLPRE AFE AA ERE DFELQG Y AFE AAEEQLRR PR I VHVGLVQNR 
I PLPANAPVAEOVS ALHRRI KAIVE VAAMCGVNI I CFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTR FCQKIjAKNHDM VWS PILE 
RDS EHGDVLWN TAW I SNS GAVLGKTRKNH1 PRVGDFNESTYYM 
EGNLGHPVFQTQFGRIAVTJICYGRHKPLNWLMYSINGAEIIFNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHODFG Y FYGSSYVAAPDSS RTPGLSRSRDGLLVAXLDL 
NLCQQVNDVWJJ FKKTGRYEMYARELAEAVKSNYS PTI VKE* PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDXQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAKMDIDKDLEAPLYXjTPEGWSLFLORYYQWHEGAELRHLDTQ 
V0RCEDIL0OLOAWP0I DMEGDRNI WIVKPGAKSRGRG3MCMD 
HLEEMLKLVNGNPWMKDGKWWQKYIERPLLJFGTKFDLROWF 
LVTDWNPLTVWFYRDSYIRFSTQPFSLKNLDK*APLYLTPEGWS 
LFLQR YYQWHEG A E LRHLDTQVQR CED I LQ0L0AWPQ 3 DM EG 
DRNI W 1 VKPG A K S R GRG I MCMDHLE EMLKLVNGN P WMKDG KWV 
VQKYIERPLLIFGTKFDLRQWFLVrDWNPLTVWFYRDSYIRFST 
QPFSLKNLDK 


6168 


84 


1332 


VWFVPSVSAMPPKK0A0AGGSKKAE0KKKEK1IEDKTFGLKNKX 
G AKOQKFI KAVTHQVKFGQQN PRQVAOS EAEKKLKKDDKKKELQ 
ELNELFKP WAAOK I S KGAD P KSWCAFFKQGQCTKGDKCKFSH 
DLTLERKCEKRS VY I DARDEELEKDTMDNWDE KKLES WNKKHG 
EAEKKKPKTQIVCKHFLEAIENNKYGWFWVCPGGGDICMYRHAL 
PPGFVLKKKKKKKKKEDEISL*DLIERERSALGPNVTKITLESF 
LAWKKRKRQEKI DKLEQDMERRKADFKAGKALVI SGREVFEFRP 
ELVNDDDEEADD7RYTQGTGGDEVDDS VSVND I DI»S LYI PRDVD 
ETG 3 TVAS LER TS TYTSDKDENKLS EASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLN LPNAV ITRI I KEALPDG VN 3 S KEARSAI S R 
AAS VFVLYATSCANN FAKKGKRKTLN AS DVLS AME EMEFQR FVT 
PLKEALEAYRREOKGKKEASEQKKKDKDKKTDSEEODKSRDEDN 
DEDEERLEEEEONEEEEVDN*KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


S TKVMLPN TGRLAG CTVF I TG ASRG J GKAIALKAAKDGANJ VIA 
AKTAQPHP KLLGT I YTAAEE I EAVGG KALPC I VDVRDEQQI SAA 
VEKA3KKFGGIDILVNNASAISLTNTLDTPTKRLDLMMNVNTRG 
TYLASKACI PYLKKSKVAHI PNISPPLNLNPVWFKQHCGRW* W 


6171 


382 


941 


HFMOSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDOS IQEPKEANSMTAQKQKK* GLRGSRRRHAN 
SGGDlFGDSFAAYFPRVLKQVHQALSLSOEAVSVMDSt^IVRDlLD 
RIATEAGHLAHYSKCVTITSRDIRMAVCLLLPGKMGJCLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
SAQERKERLRRALEENRLIPTELRREALALCGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDFSSRLKMFAKELKLVFPGA 








QRMNRGKHEVGALVRACKANGVTDLLWHEHRGTPVGLIVSHLP 
FGPTAYFTLCNVVMRHDIPDLGTMSEAKPHLITHGFSSRLGKRV 
SDILRYLFPVPKDDSHRVITFANQDDYISFRHHVYKKT3HRNVE 
LTE VG PR FELKLYMI RLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE*AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL 
IjQHQR I h tg e kp y vcs VCG KAFSQSS VLS KHRTIHTGE k P YECN 
ECGKAFRVSSDLAOHHKXHTGEKPHECLECRKAFTQLSHLIQHQ 
RIHTGERPYVCPLCGKAPNHSTVLRSHQRVWTGEKPHRCNECGK 
T FS V KR TLLQHQ R I HTGEKP YTCS E CGKAFSDR S VL I OH HNVHT 
G EK P YE CS E CG KT FS HRS TLMNHER I HTE E K? YAC YECG KAF VQ 
HSHLIOHOKVHRKL*PTCVLSVGSAIAGVPTSFSISVSTLERSP 
MCAVYVGR PSARAQSLVNTGQFTQVRSPMS VMSVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGLGNPGLPGTRHSVGMAVLG01ARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLIiRPRRLMNANGRSVARAAELFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSC3SCLNSNA 
MPRLRVG3GRPAHPEAVQAHVLGCFSPAEQELLPLLLDRATDLI 
LDHIRERSOGPSLGP*H*WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGOPRAEGLGAFAEGPLRAMAAPVKGNRKQSTEG 
DALD P PAS P K PAGKQNG I QNP I SLEDS PE AGG ER EEEC E REEEQ 
AFLVSLYKFMKERHTPIERVPHLGFKQIfJLWKIYKAVEKLGAYE 
LVTGRRbWKNVYNELGGSPGSTSGATCTRRHY*RLVLPYVRHLK 
GEDDK PLPTS KPRKQYKMAKENRGDDGATERPKKAKEERRMDOM 
M PG KTKADAADPAPLPSQE PPRNS TEOQG LAS G SS VSFVGASGC 
PEAYKRLLS S FY CKGTHGIMS PLAKKKLLAQVS KVEALQCQEEG 
CRHGAEPCAS PAVHLPES PQS P KGLTENSR HRLTPQEGLQAPGG 
SLREEAOAGPCPAAPIFKGCFYTHPTSVLKPVSQHPRDFFSRLK 
DGVliLG F PG KEGLSVKE PQLVWGGDANR PSAFH KGGSR KG I LYP 
KPKACWVSPKAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 
GKRLRAVS PFLKEADAKKCGAKPAGSGLVSCLLG PALGPVP PEA 
YRGTMLH CPLN FTGTPGPLKGQAALPFS PLVI PAFPAH FLATAG 
PSPMAAGUMHFPPTSFDSALRHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQI IGASGFSESSLFCKWG I HTGAAWKLLS 
GVREGQTQVDTPQ I GDMAYWSHPI DLHFATKGLQGW PRLHFQVW 
SQDS FGRC0LAG YGFCHVPSS PGTHQLACPTWRPLGSWREQLAR 
AFVGGGPQLLHGDTIYSGADRYRLHTAAGGTVHLEIGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGOEHRL 


6177 


1400 


992


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDCIKRSDFLGFSGYSPHFVAISTNSEHKMQPSSMQOAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGG I KG VARAAS L VGRRRAGTGMAIiIibCLVCLTAALAHG CL 
HCHSNFSKKr SFyRHHVNFKSWWVGUl PVSGAIjI/rDWbiJiJl MKE 
LHLAI PA K I TREKLDOVATA V YQMMDQL YQGKM YFPG Y FPNELR 
N I FR E0 VH L 1 ON A 1 1 ESRI DCQHRCGI FQ YETI S CNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THRAAPA FLVLPALR CLEP PHLANLS LEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
PPPKWDR WNE KRAM FGVY DN I GI LGN FEKHPKEL I RGPI WLRG 
WKGNELQR CI RKRKMVGSRMFADDLHNLNKRIRYLYKH FNRHGK 
FR+ KRKLR TS EKAHLS PWRRETVLFPVRKRLCI FS VI KWGFFGI 


6180 


156 


1833 


DHHI LKAAS TTHVCARGNI FAI PNTRCLEC * ATATPSS LECQN * 
SHLSLCPLPATTSGliTPNSMIPEKERQNIAERLLRVMCADLGAL 
SWSGKEFLKLA0TLVDSGARYGAFSVTEIIX?NFNTLA1jKHLPR 
M YNQVKVKV TCALG SNACLG I GVTCHS QS VG PDSCY I LTA YQAE 








GNHIKSYVLGVKGADIRDSGDLVHHWVONVLSEFVMSEIRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWOSVLSKRTLQARSMHE 
VIELLN^CEDLAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 
HERYEQ I CE F YSRA K KM NLI QS LNKHLLSKLAAI LT PVKQAV I E 
LSNSSQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KENFKVH PAHKVAM I LDPQQKLR PVP PYOHEEI I G K VCELINE V 
KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 
PLFOATPDLFQYWSCVTOKHTKLAKLAFV^LLAVPAVGARSGCVN 
NCEQALLI KRRRLLS PEDMNXLM FLXSNML 


6181 


169 


1032 


TRTLIjSPVLLPGPRWKPWRRRPMGPLAIiPAWLQPRYRKNAYLFI 
YYLIOFCGHSMIFTKMTVRFFSFGKDSMVDTFYAIGLVMRi.CQS 
VSLLELLHIYVGIESNHLLPRFLQLTERIIILFWITSQEEVQE 
K Y WC VL F VFWNLLDM VR YT Y S MLS V I G I S YAVLTWLS QTLWM P 
IYPLCVLAEAFAIYOSLPYFESFGTYSTKLPFDLSIYFPYVLKI 
YLMMLFI GM YFTYS K L YS ERRD I LGIFPI KKKKM*STAFQCDTR 
KDRLW1CCSK*NTGSILVEKFLVF 


6182 


1769 


1224 



AS* I DYOLNTLLKEFQLTEENTKLRYLTCSLIEDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDliDETRNLSAHKlSGNFLMEFQ 
VKNVPSER1ATQKILSVLCECLDIIFGPGCVGVQKILNARCPLVR 
FSHQA S G FOCDLTTNN R I ALTSS ELLY I YGALDS R VRAIiV FS VR 
CWARAHSLTS S I PGAW I TNFSLTMMVI FFLORRS PP I LPTLDSL 
KTLADAEDKCVIEGNNCTFVRDLSRIKPSONTETLELLbKEFFE 
YFGNFAFDKNSINlRQGREQNKPDSSPI>YlQNPFETSLNISK>rV 
SOSOLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRXSFTKKKSNKF A1ETVKNLLE SLXGNRTENFTKTSGKRT I 
STQT 


6183 


1118 


452 


HLDRY1KSPGSGSSTPAPPSHLLLYLLHP0STRTMGCCGCSRGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCyVPVCCCKPVCSV?VPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CCSSCCKPCCCQSNCCVPVCCOCKI*GSGPRPSGFSCLVXAFLM 
VP 


6184 


1 


2191 


I VTVREEDGAPAVAPPGVWS RANKR SGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRI I TS 
ELYRS LGDVLR0 VDAKALVRS DFLL VYGD VIS N IN ITRALE EH R 
LRRKL* KNVSVMTMI F KES S PS H PTRCHEDNVWAVDS TTNR VL 
H FOKTQGLRRFAFPLSLFQGSSDGVE VRYDLLDCHI SICS PQVA 
' QLFTDW FDYQTRDD F VRGLLVNEE 1 LGNQI HMHVTAKEYGARVS 
NLKMYSAVCADVIRRWVYPLTPEANFTDSTTQSCTHSRHNIYRG 
P EVS LGHG S I LEENVLLGSGTV IGSN CF ITNS VI GPGCH I E PGD 
NWLDOTYLWQGVR VAAGAQI HOSLLCDNAEVKERVTLKPRSVL 
TSOVWG PNI TLPEGS VI S LHPPDAEEDEDDGEFS DDSGADOEK 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEEELOQNLWGLKI 
WMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
GKEEN1 S CDNLVLE I NS LKYAYNI SLKEVMQVLSHWLEFPLQQ 
MDSPIjDSSRYCIAIjIiIjPJjJjKAWS PVr K1V i JL Xj<J\f\DnhtJ\Lj/\Al LD 
F FLEH E ALG I SMAKVLMAFY QLE I LAEETI LSWFS QRDTTDXG Q 
QLRKNQQLQRFIQWLXEAEEESSEDD 


6185 


791 


44 


P CTSCV LW ATLHI»P AS TRKAP Q AECGM I S I TEWQK1GVG IT G FG 
I FFILFGTLLYFDSVLLAFGNLLFLTGLSLI 1GLRKTFWFFFQR 
HKLKGTS FLLGG W I VLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFXGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEP P PQS PALTHS PTY PGPPQVQKERNGAEQLTSNPQVDSR 
GCQEAEMOTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


V YG I DS SNTNTHGAE ERNRKLKKHWKLCHAOSRLDVNGLALKMA 








KEKKVKNKVKKKADTEEVFNNSPTNQEKMPTSAIIiPDFSGSVIS 
N I RNOMETLHSQPHOEENLCFENS FSL1 NLL P IN AV EPTSS QQ I 
PN R ETS EAN KERR KM TS KS SE SNT YS PLTS F I TADS ELHD 3 1 KD 
LEDCbMVGLHTCGDLAPNTLRI FT SNS EI KG VCSVGCCYHLL.se 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 
AAGQGLPTESLFYRAVLQDI I KDCYG3TKCDRHVGK I YSKCS S F 
LDYVRRSLKKLGIiDESKLFEKIIMNYYEKYKPRMNELEAFKMLX 
WLAPC I ETLI LLDRLCYLKEQED I AWSALVXLFDPVKS PRCYA 
VIALKKQ0* FPLKQI IRCI£L*DSAGCAEEVSVGDGGPAI»RDAP 
PSGSRVGSRYD 


6187


1703 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMKPKASCPA 
AAPLMERKFHVLVGVTGS VAALKLPLLVS KLLDI PGLEVAWT7 
ERAKHFYS PQDIPVTLYSDADEWEMWKSRSDPVLH3 DLRRWADL 
LLVAPLDANTLGKVASGICDKLLTCVMRAWDRSKPLLFCPAMNT 
AMWEHPI TAQQVDQLKAFG YVE I P CVAKKLVCGDEGLGAMAEVG 
T 1 VDKVKE VLFQHSGFQQS * PG I SVMGVPLY S E WVOAKS VKNDV 
GKIGGYPHLLNGGPALSLPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188 


* Jc 




N1GVFI C I RCAGIHRNLGVHI SRVKSVNLDQWTQEQ1 QCMQEMG 
NGKAMRLYEAYLPETFRRP01DPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPOKKEDPQLP 
RKSSPKS TAPVMDLLGLDAPVACS I ANSKTSHTLEKDLDLLAS V 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QLS KDSILSLYGSOTPQMPTQAMFMAPAOMAY PTAYPSFPGVTP 
PNS I MGSMMPPPVGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMr^O^GYMAG^lAAMPQTVYGVQPAQ0LQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPQMWK 


6189


-I X. y 1 


793 


T0LSHARQRPSCQGSOLIALDLOHMDISR0PKW0HVOPVARQV0 
RA0QAQLAEGVAVHLWAGDAWAEVELLOEVGGGKVFAANACDL 
WQDHEGAHAAROATGHALQRVIVQVRRV0PLEAL.*RVPSGLPR 
RVRAFMILHNQITGIGREDFATTYFLEELiNLf YNRITSPQVHRD 
A FRKLRLLR SLDLSGNRLHMLPPGLPRNVHVLKVKP^EIJUUjAR 
GALAGMACLRELYLTSNRLRSRALGPRAWVDL/^HLOLLDIAGNQ 
LTE1 PEGLPES LE YLYLQNN KISAVPANAFDS TPNLKG I FLRFN 
KLA VG S WDS AFRRLKHL0 VLDI EGN LEFGB 2 S KDRGR LGKEKE 
EEEEDEVEEBETR 


6190 


66 


1309 


I L-VGNVSFLLS FAEYVCNCS WGSLNVNRCKGTTGOCE CRPGYQ 
GLKCETCKEGFYLNYTSGLCQPCDCSPHGALS 1 PCNSSGKCQCK 
VGV3 G S I CDRCQDG YYGFS KNGCLP CQCNNR £ AS CD ALTGACLN 
COENS KGNHCEECKEGFYQSPDATKECLRCPCS AVTSTGSCS I K 
SSELE PECDQCKDGYIGPNCNKCENGYYNFDS 1 CRKCC CHGHVY 
P VKTP K I CKPE SG E CINCLHNTTG F W CENCL* G YVHDLEGNCI K 
KVI LPTPEGST I LVSNASLTTSVPTPVINST FTPTTLCTI FSVS 
TSmS TSALADVS WTOFNt I ILTVI 1 1 WVLLKG FVGAVYM YRE 
YQNRKLNAPFWTIELKEDNISFSSYHDSIPNADVSGLLEDDGNE 
VAPNGQLTLTTP I HNYKA 


6191 


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDW1 KKIWY1 YTMEYYATIKRNEIMFFAGTWKEMEAI ILSKLM 
ODYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSIAVYAEDSEPESDGEAGIEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSROSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEI KI PPEPPG 
RCSNHLODKIQKLYERKIKEGMDMNYIIORKKEFRKPSIYEKLI 
0FCA I DELGTNYPKDMFDPHG WS EDS Y YEALAKAQK I EMDKLEK
AKKERTKIBF VTGT K KGTT'J K ATS T TTTTAS TAVAJD AO K R KS KW 
DSAIPVTTIAQPTILTTTATLPAWTVTTSASGSKTTVISAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGN KMAG KKNVLS S LA V Y AEDS E PESDG E AG 1 E A VGS AAE E 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDS2TEK 
P E ADDPKDNTEAEKRDPQE LVAS PS ERVRNKS PDEI KI PPEPPG 
RCSNHLQDKI QKLYERKI K5GMDMNY I IQKKKBFRKPS I YEKLI 
QFCAI DELGTNYPKDN FDPKGWSEDS Y YEALAKAOK 1EMDXLEK 
AKKERTK I E FVTGTKKGTTTK ATS TTTTTASTAVADAQKR KS KW 
DSAIPVTTIAQPTILTTTATbFAWTVTTSASGSKlTVISAVGT 
IVKKAKQ 


6194 


3 


950 


TRGCG^n<^mGKKNVLSSLAVyAEDSEPESDGEAGIEAVGSAAEE 
KGGL VS DA YGEDDFS RLGG D EDGY E E E E DEN S RQS EDDDS ETEK 
PEADDPKDNTEAEKRDPQE LVASFSERVRNKS PDE I KI PPEPPG 
RCSN1ILQDKIQKLYERKIKEGMDKNYII0RKKXFRNPSIYEKX.I 
0FCAIDEU5TNYPKDMFDPHGWSEDSYYEALAKA0KI EMDKLEK 
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAOKRKSKW 
DSAI PVTTIAOPTILTTTAT LPAWTVTTSASGSKTTVI SAVGT 
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDS PS VRKTKCSGR KHKENVKD 
YYQ KWM E EOAQSL I DKTTAA F QQGK I PPTPFSAPP? AGAM I PP P 
PSLPGPPRFGMMPAPHMGGFPMMPKMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRPPAKPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRN I LDNAEQ7 I SNLEARN L3PRLTP LLQEEDSH 
QRLLMG LM VS E L K DHFLR H LOG VE K K K I EQMVLD Y I S KLLDLI C 
HI VETNVJRKHNLHSWVLHFICSRGSAAEFAVFHIMTR X LEATNSL 
FLPLPPGFHTLHTIl^VQCLPLHNLLHCIDSGVLLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRLLQNYKKQPRNSKINKSSFSVEFLPbNYFlEILTDIESS 
NQALYPFEGHDNVDAEFVEEAALKHTAMLLGL 


6197 


3 


819 


ADPEGTE2AVMSRYTRPPW7SLFIRNVADATRPEDLKREFGRYG 
PIVDVYI PLDFYTRR PRGFA Y VQFEDVRDAEDALYN LNRKWVCG 
RQIE I QFAQGDRKTPGQMKS K ERHPCS PSDHRRSRS PSQRRTRS 
RSSSWGRNRRRSDSLKESRKRRFSYSOSKSRSKSLPRRSTSARQ 
SRTPRRN7GSRGRSRSKSL0KRSKSIGKSQSSSPQK0TSSGTKS 
RSHGRHSDSIARSPCKSPKGVTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


S EAALS PS FI SPACFLLRKLPAiEDGTIiPHPDTLGMN YBGARSE 
RENHAADDSEGGALDMCCSEV: T .PGLPQPI VMEALDEAEGLQDSQ 
REMPPPPPPSPPSDPAQKPFPRGAGSKSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIWSQNATNLVSSLLTLLKQLEPTAWLDSGTW 
G VPSL LLVFLSGGL VLVTTL VW HLLR TP P E P PTPLP P EDRRQS V 
S RQ PS FT Y S E WMEEK 1 SDD F LDLDP V F ETP VFDCVMD I KP E ADP 
TSIjTVKSMGLQERRGSNVSL'J'IDMCTPGCNEEGFGYLMSPREES 
AREY^LSASRVLQAEELHEKA.LDPFLLQAEFFEIPMNFVDPKEY 
DI PGLVRKNRYKT1 LPNPHSKYCLTSPDPDDPLSSY 1 NANYIRG 
YGGEE KVY I ATQG P I VS TVAI> F WRM VWQEHTP 1 1 VK 3 TN I EEMN 
EKCTEYWPEEQVAYDGVEITVOKVIHTEDYRLRLISLKSGTEER 
GLKH YWFTS W PDQKT PDRAP F LLHLVREVEEAAQQEG PHCAPI I 
VHCSAGIGRTGCFIA7S I CCCQLRQEGWDILKTTCOLRQDRGG 
MIQHCEQYQFVHHVMSLYEKCLSHQS PE 


6199


144 


1211 


KARENGESSSSWKKGAEDI KK 3 FEFKETLGTGAFSEWLAEEKA 
TGKLFAVKC I PKKALKGKESS J ENE I AVLRKI KHEN I VALEDI Y 
ES PNHLYLVMQLVSGGELFDK J VEKG FYTEKDASTLI ROVLDAV 
Y YLHR KG I VHRDLKP ENLLY Y £ QDEE S KI M I S DFGLE KMEG KGD 
VMSTACGTPGYVAPEXOiAQKPYSKAVDCWSIGVIAYlLLCGYPP
FYDENDSXLFEQILit^EYEFDSPYKDDISDSAKDFIRNLKEKDP 
NKRYTCEQAARHPWIAGDTALNKNI KESVSAQIRKNFAKS KWRQ 
AFNATAWRHMRKLH LGSS LDSSNA S VSSSLSLASQKDCASGTF 
HAL* 


6200 


T02 


56 


LP E V PK S LK P R V K P KLCCAQ P AVR V KAR bP KLA V F D LD YTLW P F 
WVDTH\TDPPFHKSSDGTVRDRRGQDVRLYPEVPEVLKRLCSLGV 
PGAAASRTSEIEGANQLLELFDLFRYFVHREIYFGSKITHFERL 
QOKTGIPFSQMIFFDDERRKIVDVSKLGVTCIHIQNGMMLCTLS 
OGLETFAKAOTGPLRSSLEESPFRA 


6201 


2B05 


2383 


GQTPRVRWKMRRSLRAGKRRQTAGRKSKSPPKVP3VIQDDSLPA 
GPPPQ1RILKRPTSNGWSSPNSTSRPTLPVKSLA0REAEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPKNVIROPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKNAPOKDRKPKRSTWRFNLDLTHFVE 
DG I FDSGNFEQFLREKVICVKGKTGNLGN VVH IERFKNKITV V$E 
KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFniSQ 
DEDESESED 


6203 


419 


2550


RCPRPFATAGAAASRPDRSPPSGISGSEAAAGAGAAAPASOHFA 
TGTGAVOTEAMK0ILGVIDKKLRNLEKKKGKLDDYOERMNKGER 
LNQDQLDAVSKYQEVTNNLEFAKELORSFMALSQDIOKTIKKTA 
RREQLMREEAEQKRLKTVLELOYVLDKLGDDEVRTDLKOGLNGV 
PILSEEELSLLDEFYKLVDPERDMSLRLNEQYEHASIHLWDLLE 
GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEEEEAA 
SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 
GEKEOVDEW^VETVEVVNSLQQQPQAASFSVPEPHSLTPVAOAD 
PLVRRORVODLMAOMQGPYNFIQDSMLDFENQTLDPAIVSAQPM 
KPTQNMDMPQLVCPPVKSESRLAQPNQVPVQPEATQVPLVSSTS 
EGYTASQPLYQPSHATEQR PQKEPIDQ I OATI SLNTDQTTAS SS 
LPAASQPQVFOAGTSKPLHSSGINVNAkPFOSMQTVFKMNAFVP 
PVNEPETLK00N0Y0ASYNOSFSSOPHOVE0TELQQEQLOTWG 
TYHGSPDOSHOVTGNHQQPPQQNTGFPRSNQPYYNSRGVSRGGS 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RDYSGYORDGYQ0N 7 FKRGSG0SGPRGAPKGRGGPPRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHNLISLLGGRALIHFNRFLNLKIOEGEAHNIFCPAYDCFOLV 

pgd1 1 ksws kemdkr ylq fdi kaf v enn p ai kw cptpgcdrav 
rltkqgsntsgsdtlsfpllrapavdcgkghlfcweclgeahep 
cdcqtw knwlo k i temkpeelvgvseay eoaanclwlltns xpc 
ancksp3 qknegcnhmqcakckydfcw j cleewkkhsfvkwevi 
yrctryeviohveeqskemtveaekk>:krfqeldrfmhyytrfk 
nhehsyoleqrllktakekmeqlsralketeggcpdttfiedav 
hvllktrri lkcs yp ygfflepkstkke i felmqtdlekvtedl 
aqkvnrpy lrtprhk1 i kaaclvqqkrqe flasvargvapads ? 
eaprrsfaggtkdweylgfaspeeyaefoyrrrkrorrrgdvhs 
llsnppdpdeps estldi peggsssrr pgts wss asm5vlhss 
s lrdytpasrs ekqdsloalssldedd pk i lla1 qlslqes gla 
ldeetrdfi.sneaslgaigtslpsrldsvprntdspraalssse 
llelgdslmrlgaendpfstdtlsskflsearsdfcpsssdpds 
agqdpn i ndnllgni maw fhdmn pqs tali ppattei sadsqlp 
cikdgsegvkdvelvlpedsmfedasvsegrgtqieenpleeni 
pgggkqhpqaw 


6205 


1 


1200 


RAHRG KMALE VG DME DGQLS DS DSDM T VA PS DR PLQL P KVLGG D 
S AMRAFQNTATACAP VSHYRAVES VDS SEES FSDSDDDS CLK KR 
KRQKCFNPPPKPEPFQFGOSSQKPPVAGGKKINNIWGAVLOEON 
ODAVATELGILGMEGTIDRSRQSETYNYLLAKKLRKESQEHTKD 
LDKELDEY^GGKKMGSKEEENGQGHLKRKRPVKDRLGNRPEMN
Y KGRYEITAEDSOE KVADEI S FRLGE PKKDLI AR VVR 1 1 GNKKA
ISLLMETAEVEQNGGLFIMNGSRRRTPGGVFLNLLKNTPS JSEE 
OIKDIFYIENOKEYENKKAARKRRTQVLGKKMKOAIKSLKFQED 
DDTSR ETFA S DTNE ALAS LDE SQEGHAEAKLEAE E A I E VDHSH D 
LDIF 


6206 


10 


1442 


II5ERRERSCLHLVCIRCSCDWEMGSVLGLCSMASWIPCLCGS 
AP CLLCRCCPSGNNS TVTRLI YALFLLVG VC VAC VM LI PGMEEQ 
LN KI PG FCENEKG VV PCN I L VG Y KAV YRLCFG LAM F YLLLS LLM 
I KVKS S S Dp R AAVHNG FW FFKFAAAI A 1 1 IGAFF1 PEGTFTTVW 
FYVGMAGAFCFILI 0LVLL1 DFAHSKNES WVEKNEEGNSRCWYA 
ALLSATALNYLLSLVAIVLFFVYYTHPASCSENKAFISVNMLLC 
VGASVMS I LPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPSLLSI IGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYSSIRTSrWSCVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYSYSFFHFMIjFLASLYIMMTLTKWYRYEPSREM 
KS OWTAVWVKI SSSWIGI VLYVWTLVAPLVLTNRDFD 


6207 


2924 


1471 


7 vmaeaatpg ttatts gagaaaataaaas ptpi ptvtapslgag 
gggg3sdgsgggwtkqvtcryfmhgvckegdncryshdlsdspy 
swckyforgyciygdrcryehsxplkoeeatatelttksslaa . 
SSSLSS ivgplvemntgeaesrnsnfatvgagsedwvnai efvp 

GOPY CX3RTAPS CTEAPLQGSVTKE2SE KEQTAVETKKQLCP YAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAORS0H I KSC3EA 
HE XDMELSFAVQRSKDMVCG I CMEWYEKANPS ERRFG I LSNCN 
HTY CLKC I R K W RS AKQ FES K I IKS CPE CR I TSN F VI P S E Y WVEE 
KEEKQKLILKYKEANSNKACRYFDEGRGSCPFGGNCFYKKAYPD 
GRREEPQROK VGTS SR YRAQRRNHFWBLI EERENSNPFDNDEEE 
WTF3LGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6208


2924 


1471 


T VMAEAATPG TTATTS GAGAAAATAAAAS PTP I PTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCXYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SSSLSS I VG P LVEMNTG E A ES RNSNFAT VG AG S EDW VNA I E FV P 
GO P Y CGRTA PS CTEAPLOGSVTKEE SEKEQTAVETKKQLCP YAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAORSOH2KSCIEA 
KEKDMELSFAVORSKDNVCGICMEWYEKANPSERRFGILSNCN 
KTyCLKCIRKWRSAKOFESKIIKSCPECRITSNFVIPSEYWVEE 
KEEKOKLILK Y KSAMSNKACRYFDEGRGSCP FGGNCFYKHA YPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE 
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


E R LCFPCM0S K I Y S YMS PNKCSGMR FP LQE EN S VTKHE V KCQGK 
PLAG I YRKREE KRNAGNAVRSAMKSEEQKI KDARKGPLVP FPNQ 
K£ E AAEPPKT P P S S CDS TNAAIAKQALKKP I KGKQAPRKKAQGK 
TQ0NR KLTDF Y P VR RS SRKSKAELQS E E RKR I DEL I ESGKE EGM 
KI DLZ DGKGRG V I ATKQFSRGDFWEYHGDLI EI TDAKKRE ALY 
AQDPSTGCYKYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWL 
KH 


6210 


3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITWLLLSACFVT 

SVI CNQLGCPTAI KAPGWANSSAGSGRI WMDHVS CRGNESALWD 
CKHDGWGKHSNCTHOODAGVTCSDGSNLEMRLTRGGNMCSGR I E 
IKFOGRWGrvCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGP 3 W FDDL I CNGNE S ALWNCKHQG WG KHNCDHAEDAGV I CS KG 
ADLSLRLVDGVTECSGRLEVRFQGEWGriCDDGWDSYDAAVACK 
QLGCPTAVTA J G R VNAS KG FGHI WLDS VS CQGH E P AVWQC KHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVE 1 ORL 
LGKVCDRGWGLKEADVVCRQLGCGSALKTSYQVYS K IQATNTWL
FLSSCNGNETSbWDCKNWOWGGLTCDHYEBAKITCSAHRSPRLV 
GGDXPCSGRVEVKHGDTWGS1CDSDFSLEAASVLCRELQCGTW 
S I LGGAK FGEGNGQ I WAEE FQCEGHE SH LSLCPVAPRPEGTCSH 
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQQbKCGVALSTPGGAR FG KGNGQI WRKMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACJESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAHWCROLGCGEAINATGSAHFGEGTGPlWliDEMKCN 
GKESRIWOCK SKG W GQQNCRHKED AGV I C S E FMS LRLT S E AS RE 
ACAGRLEV F YNGAWGTVGKS SMSETTVGWCRQLG CADKGKI NP 
ASLDKAMS I PMWVDNVQCFKGPDTLWQCPSS PWEKRLAS PSEET 
WITCDNKIRLOEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQOLGCGPALKA.FKEAEFGQGTG ?I WLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPOKATTGRSSROSSFIA 
VGI LGWLLAI FVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNSCLNABDLDLKKS SGGHSEPH 


6211 



3761


387 


lFGMSKLRMVLLEDSGSADFRRHFVNLSPFTlTWLLLSACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
S VI CNQLGCPTAI KAPGWANSSAGSGRI WMDIIVS CRGNE S ALWD 
CKHDGWGKHSNCTHQODAGVTCSDGSNIrEMRLTRGGKMCSGRIE 
IKFOGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG 
SGP I WFDDLI CNGNESALV7NCKHQGWGKHNCDHAEDAGV1 CSKG 
APLSLR1.VDGVTECSGRLEVRFOGEWGT3CDDGWDSYDAAVACK 
OLGCPTAVTAIGRVNASKGFGHIWLDSVSCOGHEPAVWQCKHHE 
WGKHY CNHN EDAGVTCSDGS DLELRLRGGGSRCAGTVE VE I ORL 
LGKVCDRG WGLKEADWCROLGCGSALKTS YOVYS K IQATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAKREPRLV 
GGDI PCSGRVEVKHGDTWGS I CDS DFSLEAAS VLCRELQ CGT VV 
SILGGAHFGEGNG0IWAEEF0CEGHESH1.SLCPVAPRPEGTCSH 
SRDVGWCSRYTE1RLVWGKTPCEGRVELKTLGAWGSLCKSHWD 
IEDAHVLCOOLKCGVALSTPGGARFGKGNGOIWRHMFHCTGTEQ 
HMGDCPVTAbGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTI PEESAVACI ESGQLRLVNGGGRCAGRVEI YHEGSWGTICD 
DSWDLSDAHWCROLGCGEAINATGSAHFGEGTGPIWLDEMKCN 
GKESRIWOCHSHGWGCQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
ASL.DKAMS 1 PMVTVDNVQCPKGPDTIjVJOCPSSPMEKRLAS PSEET 
WITCDNK I RLQEG ?TS CSGR VE I WHGGS WGTVCDDS WDLDDAQV 
VCQQLGCGPALKA FKEAEFGQGTGP I WLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNC'J'DISVOKTPOKATTGRSSROSSFIA 
VGILGVVLLA1FVAL.FFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNS CLNADDLDLMNSSGGH S EPH 


6212 


1 


1134 


t i/Tiic'T D npr , A'\ t\ACVC dpaptrA dd G ff'fOT'NI'Df^ PDQCT.P'RZi'FRT? 
ljKWtiijKPoL>AVWL5 J i^Ku/io J uArKoL.V»\. y i Wro r r"o DLiKluir ivK 

RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEV7IIEKPPAERHMISSWE 
OKNK CVMPEDVKN F YLWTTKGFHMTWS VKLDEHI I PIjGSMAINS I 
SKLTQLTQSSMYS L PNAPTLADLEDDTKEAS DDQPE KPHFDSRS 
VI FELDSCNG SGKVCLVYKSG KPALAEDTEI WFLDRALYWHFLT 
DTFTA Y Y RLLI THLGLPQWQ YAFTS YG I S PQAXQRVSMYKP I T Y 
NTNLLTE E TDS FVN KLDPS KV FKS KNK I V I P KKKG P VQ P AGGQK 
GPSGPSGPSTSSTSKSSSGSGttPTRK 


6213 




1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCS KPH LE KLTLG I TR I LES S PG VTE VT 1 1 E K P PAERH M I £ S WE 
QKlWC^PEDVKNFYLMTNGfHMTWSVKLDEHIJPLGSMAINSl 
SKLTQLTOSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS
VI FELDSCNGSGKVCLVYKSGKPALAEDTE I WFLDRALY WHFLT 
DTFTAYYRLLITHLGLPQWQYAFTSYGISPOAKORVSMYKPITY 
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNFTRK


6214 


2 


46G 


KELAP S Al RRAARLGLGPARWQSRAAAFYFVRGFRTGWS FVGWV 
VLGTSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYK 
YRTYAVRRI RDAFRENKNVKDPVE IQTLVNKAKRDbGVIRRQVH 
IGQLYSTDKL I I ENRDMPRT 


6215 


2 


1849 


FVAGG PRGSGSAAETMPE I RVTPLGAGQDVGRS C I LVS 1 AG KNV 
KIiDCG.MHKGFNDDRRFPDFSYlTQNGRr.TnFLDCVIISHFHLDH 
CGALPY FSEMVGYDGP 3 YMTHPTQAI CPI LLEDYRKIAVDXKGE 
ANFFTSQMI KDCMKKVVAVHLHCTVQVDDELEI KAYYAGHVLGA 
AMFQIKVGSESWYTGDYNMTPDRHLGAAWIDKCRPNLlilTEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVLI PVFALGRAQELC 
lLLETFWERMNLXVPIYFSTGl>TEKANHYYKLFI?WTNQKIRia 
FVORNMFEFKHI KAFDRAFADNPGPMWFATPGMIjHAGOSLOI F 
R K WAGNE KNM V I M PGYCV QGTVGH KI LSGQR KLEMEGRQVL E VK 
MOVEYMSFSAHADAKG1MQLVGQAEPESVLLVHGEAKKMEFLKQ 
KI EOELRVNCYM? ANGETVTLPTS PS I PVGI SLGLLKREMAOGL 
LPEAKXPRLLHGTLIMKDSNFRLVSSEOALKELGLAEHQLRFTC 
RVHLHDTRKEQETALRVYSHLKSVLKDHCVQHLPDGSVTVESVL 
LOAAAPSEDPGTKVXAVSWTYQDEELGSFLTSLLKKGLPQAPS 


6216 


11 


3S3 


QTTRPEPRNSAliROSRSKMAWGVSSVSRLLGRSRPOLGRPMSS 
GAHGEEGSARMWKTLTFFVALPGVAVSMIjNVyjjKSHHGEHERPE 
FIAYPHltRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLXMEVKPPPGRPQPDSGRRRRRRGEEGHDPXEPEQ 
LRKLFI GGLS FETTDDSLREHFEKWGTLTDCWMRD PQTKRSRG 
FGFVTYSC^EEVDAAMC^PHKVDGRVVEPKRAVSREDSVKPGA 
HLTVKKI FVGG1 KEDTEEYNLRDYFEKYGKTET1 EVMEDRQSGK 
KRGFAFVTFDDHDTVDKI WQKYHTI NGHNCEVKKALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSK GS Y GGGDGG YNGFGGDGGN YGGG PG Y5 SRGG YGGGGPG YG 
NQGGGYGGGGGYDGYNEGGNFGGGNYGGGGNYNDFGNYSGOQOS 
NYGPMKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


6218 


1305 


90£ 


SCERKGFIMADDLKRFLYKKLPSVEGLHAIVVSDRDGVPVIKVA 
NDNAPEHALRPGFLSTFAIATDQGSKLGLSKNKS 1 1 CYYNTYQV 
VQFNRLPLWSFIASSSANTGLIVSLEKELAPLFEELRQWEVS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFL* 
I GVSGGTAS GKS TVCEKIMELLGQNEVEQRQRKWI LSQDRFYK 
VbTAEOKAKALKGOYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHSRLPETTVVYPADVVLFEGILVFYSOE3RDMFHLRLF 
VDTDSDVRLSRRVLRDVRRGRDLEQILTQYTrFVKPAFEEPCLP 
TKKYADVI I PRGTONMVAINLIVQHIQDILNGDJ CKWHRGGSNG 
RS YKRTFSE PGDHPGMLTSGKRSHLESSSRPH 


6220 


227 


764 


EON1 S LEMSCTI EKALADAKALVERLRDHDDAAESLI EQTTALN 
K^VEAMKOYgBrEIQKLNEVARHRPRSTLVMGIOOENRQIRELQQ 
ENKELR7S LEEHQS ALELIMSKYREQMFRLLMAS KKDDPG1 1 MK 
LKEQHSKIDMWRNKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RW I WDLNPVSDGLELRPKYNGI LHCLTT 1 WKLDGLRGLYQGVT p 
N I WGAGLS WGLYFVFYNAI KS YKTEGRAERLEATE YLVSAAEAG 
AMTLCI TN PLWVTKTRLMLQYDAWNSPHRQ YKGMFDTLVKI YK 
YEG VRG L Y KGFV PGLFG TSHGALQFMA YELL KL KYKQH I NRLP E 
AQLSTVEYISVAALSKIFAVAATYPYQWRARLODOHMFYSGVI 
DVI TKTWRKEGVGGFYKGIAPNLI R VTPACC I T FWYENVSH FL 
LDLREKRK


6222 


2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKP I bCPRRTTAQLG 
PRRN PAWSKJAGRLFSTQTAEDKEE PLHS 1 1 SSTESVQGSTS KH 
EF0AETKKLLD3VARSLYSEKEVFIRELISNASDALEKLRHKLV 
SDGQALPEMEIKLQTNAEKGTITIQDTGIGMTOEELVSNLGTIA 
RSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSR 
SAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKI I IHLKSDCKEF 
SSEARVRDWTKYSNFVSFPLYLNGRR^U^TLQAIWMMDPKDVRE 
WQHBEFYR YVAOAHDKPRYTLHYKTDAPLMIRS I FYVPDMKPSM 
FDVSRELGS S VALYSRKVL1 QTKATDI LPKWLRF I RGWDS ED X 
PLNLSRELLOESALIRKLRDVLQQRblKFFIDOSKKDAEKYAKF 
FEDYGLFMR3GIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
E YASRMRAGTRNI YYLCAPNRHLAEHS PYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLSEKETEELMAWMRNVLGSRVTNVKVTLRLDTlIPAMVrVLENG 
AARHFLRMQOLAKTQEERAQLLQPTLE I NPRHAL I KKLNQLRAS 
E PGbAObLVDOI Y ENAMI AAGbVDDPRAMVGRLNELLVKALERH 


6223 


3 


715 


DAWARTMAGIWDFQDEEQVKSFLENMEVECKYHC^HEKDPDGCY 
RLVDYLEG I RKNFDEAAKVLKFNCEENQHSDSCY KLGAYYVTGK 
GGLTQDL KAAAR C FLMACEK PGK K S I AACHNVGLLAHDGQVNBD 
GQPDI^KARDYYTRACDGGYTSSCF^SAFIFLOGAPGFPKDKDIi 
ACKY SMKACDLGH I WACANASRMY KLGDGVDKVEAKAEVLKNRA 
QQVHKEQQKGVQPI/TFG 


6224 


1 


133 


LRTI SSMAWG P LLLTLLAHCTGSWAQS VLTQP ? S VS GARI PHEK 


6225 


3259 


938 


bLS CH RLA I C KLjP FS V E S RKTVMG PQGARRQAFLAFG D VTVDFT 
OKEWRIiLSPAQRALYREVTLENYSHLVSLGILHSKPELIRRLEQ 
GE V PWGEERR RRPG PCAG I YAEH VLR P KNLGLAH QRQQQLQ FS D 
QSFOSDTAEGOEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIH 
KKAHSRQKLFTCRECHQGFRDESALIibHQNTHTGEKSYVCSVCG 
RGFSbKANLLRHQRTHSGEKPFLCKVCGRGYTSKSYLTVHERTH 
TGEKPYECQECGRRFNDKSSYNKHLXAHSGEKPFVCKECGRGYT 
NKSYFWHKR IHSGEKPYRCOECGRGFSNKSHLI THQRTHSGEK 
PFACRQCKOSFSVKGSLLRHQRTHSGEKPFVCKDCERSFSQKST 
LVYHQRTHSGEKPFVCRECGQGFIOKSTLVKHOITHSEEKPFVC 
KDCGRGF1QKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 
LRAHLGEKRFFCRDCGRGFTLKPNLTIHORTHSGEKPFMCKQCE 
KS FS L KANLLRHQWTHSGERP FNCKD CGRGFI L KS TLLFHQKTH 
SG EKP F I CS E CGQGFI WKS NLVKHOLAHSGKQP F VCKE CGRG FN 
W KGNLLTHQRTHSGEKPFVCNVCGOG FS WKRS LTRHHWR I HS KE 
KP F VCQECKRG Y TS KS DLT VHER I HTGERPYECGECGR KFS NKS 
YYSKHLKRHLREKRFCTGS VGEAS S 


6226 


29 


266 


TKVSELLGGSQRLFFLPLWRRLCRCGLGPRVSPMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227 


2581 


890 


MSASSLLEORPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPYLTS 
YGOLSNGEPHFLPDAMFGQPGALGSTPFIX5QHGFN?FPSGIDFS 
AWGNWSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL 
NKAPGMNTIDOGMAALKLGSTEVASNVPKVVGSAVGSGSITSNI 
VASNSLPPAT3APPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PPPPJKHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
Q0ANNSPPVAOASVGQOTQPLPPPPPQPAQLSVQQQAA0PTRWV 
APRKRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRSIN 
N YNPKDFDWNLKHGRVFI 1 KSYSEDD I HRS I KYN I WCSTEHGNK 
RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSODKWKGRFDVRWIFVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTS IFDDFSHYEKRQ



WO 01/53312



PCT/US00/34263





6228


47 


1S78 


G R R CR RR G A VMELAQ EAR ELG CW A VE EMG V P V AARA P BSTLRRL* 
CLGQGAD I WAY I LQHVHS QRT V K K I RGN LL W YGKQDS PQVRR KL 
ELEAAVTRLRAEIQELDOSLELMERDTEAQDTAf-lEOARQHTQDT 
ORRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLODMERKAKV 
DVTFGSLTSAALGLEPWLRDVRTACTLRAQPLONLLLPQAKRG 
S LPTPHDDK FGTS YQQWLS SVETLLTNH PPGHVLAALEHLAAER 
EAE I RS LCSGDGLGDTEI S R PCA PDQSDSSQTLPSti VHL IQEGW 
R T VG VLV S QR STLL KERQ VLTORJbOG LVEE VEER VhGSSERQVL 
ILGLRRCCLWTELKALHDOSQELODAAGHRQLLLRELQAKQQRI 
LHWRQIiVEEl-QEOVRLLIKGNSASKTRLCRSPGEVLALVQRKW 
PTFEAVAPOSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPS1K0LHPASPRGSSF1ALSHKLGLPPGKASELLLPAAASL 
RQDLLLLCDQRSLWCWDLLKMKTSLPPGIjPTOELLQIQASQEKQ 
OKENLGOALKRLEKIiLKQALERIPELQGIVGDWWEOPGOAALSE 
ELCQGLSLPQWRLRWVQAQGALOKLCS 


6229 


2571 


560 


GPSLLGTRGTPNPARTLQIFFL1IGRRLTGRMAAVDDLQFEEFG 
NAATSLTAN PDATTVN I EDPGETPKHQPGSPRGSGR EEDDELLG 
NDDSDKTELLAGQKKSSPFWTFEYYQTFFDVnTyCVFDRXKGSL 
LP I PGKNFVRLYI RSNPDLYG P F Wl CATLVFAIA2 SGNLSNFLI 
HLGEKTYKYVPEFRKVSIAATI 1 YAYAWLVPLALWGFLMWRNSK 
VMNIVSYSFLEIVCVYGYSLFIYIPTAILWIIPHKAVRWILVMI 
ALGISGSLlJ^TFWPAVREDKRRVAIiATIVTIVLLKMLLSVGCL 
AYFFDAPEMDHliPTTTATPNQTVAAAKSS 


6230 


1723 


60C 


SKWSGRSGKKKMSKLSRSARAGVIFPVGRLMRYLKKGTFKYRIS 
VGAPVYMAAVIEYIiAAElLELAGNAARDNKKAR I AF RHI LLAVA 
NDEELNOLLKGVTIASGGVLFRIHPELLAKKRGTKGKSETILSP 
FPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFTILSSKSLVLGOKLSLTOSDISHIGSKRVEGIVHP 
TTAEIDLKEDIGKALEKAGGKEFLETVKELRKSCGFLEVAEAAV 
SOSSGLAAKFVIHCHIPOWGSDKCEEOLEETIKNCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAOVTLKAISAHFDDSSASSLKNVY 
FLLFDS ESIGI YVQEMAKLDAK 


6231 


145 


870 


hi FS SS TMDRS LRNVIiWS FG FLLL-FTAYGGLQ S LC'S SL YS EEG 
LGVTALSTLYGGMLLSSMFIiPFLLI ERLGCKGTI ILSMCGYVAF 
SVGNFFASVJ YTLI PTS ILLGLGAAPLWSAQCTYLT1 TGNTHAEK 
AGKRGXDMVNOYFG IFFLI FQSSGVWGNIil S SXjVFGQTPSQETL 
PEEQI/TS CGASDCLMATTTTNSTOR PS QQLV YTLIjG I YTGSGVL 
AVLM I AAFLQ PIRDVQRESE 


6232 


3679 


1476 


FVAGTTI^GFWVGTAPLVAAGRRGRWPPCX?I^LSAALJiTLKHVL 
YYSRQCLMVSRNI/3SVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMGI XTVAIHSDVDASSVHVKMADEAVCVGPAPTS KSYLNMDA 
I MEAI KKTRAQAVHPGYGFLS ENKEFARCIiAAEDWFIGPDTHA 
I QAMGDKl ES KLLAKKAEVNTI PGFDGWKDAEEAVR IARE I G Y 
PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL 
L IEKF I DN PRKI E I Q VJU3DKHGN A LWLNERECS I QR x NQKWEE 
APSIFLDAETRRAMGEQAVALARAVKYSSAGTVEFLVDSKKNFY 
FLEMNTRLOVEHPVTECITGLDLVQEMIRVAKGYPLRHKQADIR 
I NGWAVECRVYAEDPYKSFGLPS I GRLSQYQfcPl»HbPtjVRVDi>G 
IOPGSDISIYYDPMISKLITYGSDRTEALKRMAPALDNYVIRGV 
THN 1 ALLR E VI I NS R FVKGD I STK FLS D VYPDGFKG H MLTKS EK 
NOLIAIASSLFVAFQLRAQHFOENSRMPVIKPDJANWELSVKLH 
DKVHTWASNNGSVFSVEVDGSKI_WTSTWNLASPLLSVSVDGT 
ORTVQCLS k EAGGNMS IQ FLG TVY KVN I LTRLAAELN KFMLE KV 
TEDTS S VLR S PM PG WVAVS VK PGDAVAEGQEI CV J E AMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233 




2654 


HSTRENLN AGNFN F PSEGHLVRS TG PGGS FAKHMVAQC VSPKG P
LACSRT Y FFGATHVP YLGGDS KLP KKTEO I RLLSQ I YAAV 1 EAV 
LAGIACYAKTSSLTKAKEVAEQTLGSGLDSFEL1PFKAALRSKM 
TFHIHAVNNOGRIVPLDSEDSLSFVKTACMAVYDIPDIjLGGNGC 
LGS WFS ES FLTS Q I LVKEKDGTVTTETS S WLTAAVPRFCS WL 
VEDNEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLYSSNLQSWP 
EEGNVHFFSSGLLFSHCRKGSIIISKDHMNSISFYDGDSTSTVA 
ALLIDFKSSLLFHLPVHFKGSSNFLMIALFPKSKIYQAFYSEVF 
SLWKQODNSGISLKVIQEDGLSVEQKRLHSSAOKLFSAL-SQPAG 
EKRSSLKLLSAKLPELDWFLOHFAISSISOEPVMRTHLPVLLQQ 
AEINTTHRIESDKVIISIVTGLPGCHASEJjCAFLVTLHKECGRW 
MVYRQIMDS.SECFHAAHFQRYLSSALEAQONRSARQSAYIRKKT 
R LL WLQG Y TD V I D WQALQTH PDSNV KA S FT IG A I TACVE PMS 
CYMEHRFLFPKCLDQCSOGLVSNWFTSHTTEQRHPLLVOLQSL 
IRAANPAAAF} LAENGIVTRNEDIEL1 LSENSFSSPEMLRSRYL 
MYPGWYEGKLNAGSVYPLNjVOICVWFC-RPLEKTRFVAKCKAIQS 
S I KPSPFSGNI YHILGKVKFSDSERTMEVCYNTLANSLS1 MPVL 
EGPTPPPDSKSVSQDSSGCOECYLVF3GCSLKEDSIKDWLRQSA 
KO KPQR KALKTRG M LTQQE 3 RSIHVKHH I.EPLPAGYFYNGTQFV 
N FFGDKTDFH P LMDQFMNEY VEEANR E 3 EKYNQELEQQE YHDLF 
ELKP 


6234 


1731 


404


PRVREDMDHKS PGNKGSLVYAGIKSIVKSSLGMVESSRHNWSGL 
D KQSD1 QNLNE ERI LALQLCGW I KKGTDVDVGPFLNSLVQEGBW 
ERAAAVALFNLDIRRAI Q3 LNEGASSE KGDLNLNWAMALSGYT 
DEKNSLKREMCSTLRLQI'N^PYLCVMFAFLTSETGSYDGVLYEN 
KVAVRDR VAFA CKFLS DTQLNRYI EKLTN EMKEAGNLEG X LLTG 
LTKDGVDLMESYVDRTGDVOTASYCMT.QGSPLDVLKDERVQYWI 
ENYRNLLDAWK FWHKRAEFD I HRSKLDP S S KPLAQVFVS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRXPLPRC 
/UjCLINMGTPVSSCPGGTKS dekvdls kdkklaofnnwftkchn 
CRHGGHAGHK LS WFRDHAE C P VS ACTCKCMQLDTTGNL VP AETV 
OP 


6235 


1 


571 


E KR DHR L PS W r RAALKV PGR GGR VGTT PE 1AAGG I MATR N PP PQ 
DYESDDDSYEVLDLTEYARRHQWWNRVFGHSSGPMVEKYSVATQ 
1 VMGGVTGWCAGFLFQKVG KLAATAVGGG FLLLQI ASHSG YVQI 
DWKRVEKDVNKAKRQI KKRANKAAPEINNLI EEATEFI KQNIVI 
SSGFVGGFLLGIAS 


6236 


1 


703


WDONKGAAAGSGLTLPSLPSARFSAGPPTORSRPTKSNMEKHLF
NLKFAAKELSRSAXKCDKEEKAEKAKIKKAIQKGNMEVARI'HAE
NAI RQKNQAVN F LRM S AR V DA VAAR VQT AVTMG KVT KS MA GWK
SMDATLKTMN1.EKISALMDKFEHQFETLDV0TQ0MEDTMSSTTT
LTTPQNQVDKLLQEMADEAGLDLNMELPOGQTGSVGTSVASAEQ
DELSQRLARLRDQV


6237 


312 


720 


PTAMAEEGIAAGGWDVNTALQEVLKTALIHDGLARGIREA^KA 
LDKRQAHLCVU».SNCDEPMYVKLVEALCAEHQINLIKVDDNKKL 
GEWVGLCKI DREGKPRKWGCSCVWKDYGKESQAKDVI EEYFK 
CKK 


6238


? 


4666 


EEVPTOESVKWEINVIIKNPEIVFVADMTKNDAPALVITTOCEI 
CYKGNLENS TMTAAI KDLQVRACPFLPVKRKGKI TTVLQPCDLF 
YQTTQKGTDPOVIDMSVKSLTLKVSPVIIIvTMlTITSALYTTKE 
TIPEETASSTAKLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMIKMNIDSI FIVLEAG2 GHRTVPMLLAKSRFSGEGKNWSSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDOTEDFRPWNLGIK 
MKKKAKMAI VES DPEEEN Y KVPEYKTV IS FHS KDQLNITLSKCG 
LVMLNNLVKAFTZAATGSSADFVKDLAPFKILNSLGLTISVSPS 
DS FS VLNI PMAKS YVLKNG E SLSMDY I RTKDNDHFNAMTSLSSK 
LFFI LLTPVKK STADKI PLT KVGRRLYTVRHRES GVERS I VCQI
DTVEGSKKVTIRSPVQIRNHFSVPL.SVYEGDTLLGTASPENEFN 
1PLGSYRSFIF LKPEDENYQMCEGIDFSEIIKNDGALIjKKKCRS 
KNPSKE5FLINIVPEKDNLTSLSVYSEDGWDLPYIMKLWPPILL 
RNLLPYK I AY Y I EG I ENS VFTLS EGHSAQ I CTAQLGKARLHLKL 
LDYLMHDWKS EYH X KPNQQD I S FVS FTCVTEMEKTDLD I AVHMT 
YNTGQTWA FH S P Y WM VNKTGRMLQ YKADGI HRKH P PNYKKPVL 
FSFQPNHFFNNNKVQLMVTDSELSNQFS3DTVGSHGAVKCKGLK 
MDYQVGVT3DLSSFN3TRIVTFTPFYMIKNKSKYHISVAEEGND 
KWLSLDLEOCIPFWPEYASSKLLIQVERSEDPPKRIYFNKQENC 
I LLRLDNE LGG 1 3 A E VN L\A EHS T V I TFLD YHDGAAT F1»L I NHT K 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRliKWRCRKS 
HGEVTQKDDKMMPIDLGEKTIYLVSFFEGIiORIILFTEDPRVFK 
VTYESEKAELAEQEI AVALQDVG I SLVNKYTKQEVAY 1GI TSSD 
WWETKPKKKARWKPMSVKHTEKLEREFKBYTESSPSEDKVIQL 
DTNVPVRLT PTGHNMK I LQ PH VI ALRRNYLPALKVEYNTSAHQS 
SFRIQIYR3QIQNQ3HGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
V^TVMR^ AGH^OT <^E7 KY^KVT»TOFMDT>RT>DIiGFTYAT>TDLMTF 
AEVTENTEVELFHKD3 EAFKEEYKTASLVDQSQVSLYEY FHISP 
IKLHLSVSLSSGREEAKDSKQNGGLI PVHS LNIjLI»KS I GATLTD 
V0DWFKLAFFELNYQFHTTSDLOSEVIRKYSKQAIKOMYVLIL 
GLDVLGN PFGLIREFS EG V E AF FYE P YQGA I QG P E E FVEG MALG 
LKALVGGAVGCLAC AAS KI TGAMAKGVAAMTMDEDYQQKRREAM 
NKQPAGFREG I TRGGKGLVSGFVS G ITG IVTKP I KGAQKGGAAG 
FFKGVGKGLVGAVAR PTGG I IDMASSTFQGI KRATETSEVESLR 
P P R FFNEDG V I R P YR LRDG TGNQMLQKI QFYRE W I MTHS S S S DD 
DDDDDDDDESDLNH 


6239 


2108 


634 


K PGMAG KG S SG R R PL LLG LL VA VATVH LV I C P Y TK VE E S FN LQA 
TKDLLYHWODLEQYEKLEFPGVVPRTFLGPVVIAVFSSPAVYVL
SLLEMSKFYSQJLI VRGVLGLGV J FGLWTLQKEVRRHFGAMVATM 
FCWVTAM0FHLMFYCTRTLPNVIjALPWLLA1»AAWLRKEWARFI
WLSAFA3 3 VFR VELCLFLGLLLIiLALGNRKVS WRALRHAVPAG 
ILCLGLTVAVDS YFWRQLTW PEGKVLWYNTVIjNKS SNWGTS PLL 
WFYSALPRGLGCSLLFIPLGLVDRRTHAPTVLALGFKALYSLL 
PHKELRFI 3 YAFPMLN3 TAARGCSYLLNNYKKSWLYKAGSLLVI 
GHLVVNAAYS ATALYVSHFNYPGGVAMQRLHQLVP PQTDVLLHI
DVAAAQTGVSR FL0VNSAWRYDKREDVQPGTGMLAYTH3 LMEAA 
PGLLALYR DTKR VLAS WG TTGVSLNliTQLP PFNVHIXJTKLVLL 
ERLPRPS 


6240


2202 


1176 


HERGDSLKEPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DFLSSGSRSSSLKSAQGTGFELGOLQSIRSEGTTSTSYKSLANO 
TRNGSLSYDSLL-rPSDSPDFESVQAGPEPDPPI^GYTSPFLSARL 
AQQREAERHPRLVPTGPTHR2PSPVRYDNLSRHIVASLQEREKL 
LROSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQPGVSETEEVALOPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEEKKRLSLOR E Kl IAR VS I DNRTRALVQALRRTTDP KLC I T 
R VEELTFHLljE FP EG KGVAV KER 1 1 PYLLRLRQI KDETLQAAVR 
EI LAL I G YVD PVKGRGIRILSI DGGGTRGWALQTLRKL V ELTQ 
KPVHQLFDY1 CGVSTGAI LAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVI VGTVKMS WS HAF YDSQTWEN ILKDRKGSALM IETARiaP 
TCPKVAAVST I VIJRG I TPKAFV FRNYGHFPG INSH YLGGCQYKM 
WOAIRASSAAPGYFAEYALGNDLKQDGGLLLNNPSALAWHECKC 
LWPDVPLECIVSLGTGRYESDVRNTVTYTSbKTKLSNVINSATD 
TEEVHI MLDGLLPPDTYFRFNP VMCEN I PLDE SRNE KLDOI^QLE 
GLKYIERNEOKMKKVAKILSQEKTTLQKINDWIKLKTDMYEGLP 



WO 01/53312 



PCT/US00/34263



SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








FFSKL 


6242 


196 


1310 


QHFLPGAETWS PGAAV CTARRFPGRS LAAFPR PAAPRRAVEMGE 
SSEDIDC'MFSTLLGEMDIiLTOSLGVDTIjPPPDPNPPRAEFNYSV 
GFKDLNESLNAi,EDODLDALKADLVADISEAEORTIQAQKESLQ 
NQHHSASLQAS I FSGAASLGYGTNVAATGISQYSDDLPPPPABP 
VLDLPLF PP PPE PLSQEEEEAQAKADKI KLALEKLKEAXVKKLV 
VKVHMNDNSTKSLMVDERQIARDVLDNLFBKTHCDCN\TDWCLYE 
I YPELQI ERFFEDHENWEVLSDWTRDTENKI LFLEKEEKYAVF 
KNPQNFYLDNRGKKES KETNEKMNAXNKESLLEVRL2 LOSGRKE 
KDVCS IFKS FASEMNGKI 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TSRASSRRUACGPQTRAGAETRSTAMIRAKSAARDTRRATCRSA 
AGTPSPTTMTCLTDVPTGCAAVEPTARLPAAAWAST1TTGCCPA 
MGQAGAGPAGRKGS E AGGGPGRAHHAHPSPLPR E PRVRTG ? PAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYOHLTFFSIOPQPPPCHAFHPRDPPAGTKROLILVPLK 
GPPILAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASOFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANWLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI I EYNKELOAKVNI LRRQLAELETECGMOESP 


6245 


81 


1148 


LSLRNAKYSFPOELISLFSMTDLNDNICKRYIKMITN3VILSLI 
ICISLAFWI1SMTASTYYGNLRP1SPWRWLFSWVPVL1VSNGL 
KKKSLDHSGALGGLWGFILTIANFSFFTSLLMFFLSSSKLTKVJ 
KGEVKKRLDSEYKEGGORNWVQVFCNGAVPTELALLYK2ENGPG ' 
EIPVDFSKOYSASWMCLSLLAALACSAGDTWASEVGPVLSKSSP i 
RLI TT WE K VP VGTNGGVTWGLVS S L.LGGTFVG I AYFLTQLI FV 
NDLDT S A PQWP 1 1 APGGLAGLIjGS I VDS YU3ATMQYTGLDESTG 
MWNS PTNKARH I AGKP I LDNNAVN LFS S VL1ALLLPTAAWG FW 
PRG 


6246 


1177 


359 


SLWPVJILMDDSLMQISLOLLCVYTANFPNGCSSLCWSSCGQHPV 
OATHRGAVSNSLMLCILKLASQMFLENTTVOOMVFMLLSNLALS 
HDCKGVIQKSNFLONFLSLALPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQOMILRLDGCl,DLLTEMSKYKHKSSPLLPLI,IFHNVCFS 
PANKPKI LANEKV I TVLAACLESENQNAQRIGAAALWALI YNYQ 
KAKTALKS PS VKR R VDEAYSLAKKTFPNSEANPLNAY YLKCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPjjTDDTSHAGP 
PGPGRALLECDH LR S G VPGGRRRKDWS CS LLVAS LAGAFG S S Fl» 
YGYNLSVyNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
SI FA2 GGLVGTL J VKMI GKVLGRKHTLLANNG FAI SAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
OVTAIFICIGVFTGOLLGLPELLGKESTWPYLFGVTWPAWOL 
LSLPFLPnsPRYLLLEKHNEARAVXAFQTFLGKAHVSQEVEEVL 
AESRVORS I RLVSVLELLRAP YVRWQWTVI VTMACYQLCGLNA 
IWFYTNSIFGKAGIPPAKlPYVTLSTGGIETLAAVFSGLVIEHb 
GRRPLLIGGFGLMGLFFGTL7I TLTLQDHAPWVPYLS I VGI LAI 
IASFCSGPGGIPFI LTGEFFQQSQRPAAFI I AGTVNWLSNFAVG 

T.T.FDPTnVCT.ryPV/TT 17I7IVTTr»TTr:a TVT VCVT PPTirWPTV7iTTT 
LiLtC trC J. \/J\S1mJ1 2 t-rJuvr/li iLl JL O.H i I ill . V LiirtL X F\Tit\± ZAC,JL 

S QAFS KRKXAYPPE EKI DS AVTDGKINGR P 


6248


56


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVONPGAALDLCI 
AAVIKECHLVILSLKSQTLDAETDVXCAVLYSNKNRMGRHKPHL 
ALKQVEOCIiKRLKTWNLEGSIODLFELFSSNENOPLTTKVCWP 
SO P WE LV LM KVLG AC KLLL R LLDCCCKTFLLTV KH1 »GLQE FI I 
LNL VM VG L VS RLWVLY XG VL KRL I LL YE PLFGLLQE VAR IQ PM P 
yFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQS PRAS EETLLGI SKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSOSRLKTTKYSSQK 
VIGTPHAXSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECI KTSI CNHLLRGSGIK 
TS KHHLRORRSQNKFLRRORKPQRKLQSTLLRE 1 00 FSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKCLLNSGVSMPVIQTKEKMI 
KENLRGIHENETDSWTVMOINKNSTSGT2KETDD3DDIFAIjMGV 


6249 


56 


1773 


V P P PR MMAA VP PG bE PWNR VR I p KAGNKS AVT VQN PGAALD LCI 
AAVIKECHLVILSLKSOTIDAETDVLCAVLYSNHNRMGRHKPHL 
ALKOVEOCLKRLKNMNLEGSIODLFELFSSNENOPLTTKVCWP 
SQ P WELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFI I 

i kit \»(\jn i/"*t t foot T.ri 7T w/**t n TvftT f t t i/r?fiT rr>f t r"MT>i rn t* mnun 

Li N Ij V MVG h V SRL WVL Y KG VL KR L I LI>YIS PLFGLIjOEVARIQPMP 
YFKDFTFPSDITEFLGOPYFEAFKKKMPIAFAAKGINKLXjNKLF 

lineosprassetllgiskkakomkinvonnvdlgqpvknkrvf 
keessefdvrafcnolkhkatqetsfdfkcsosrlkttkyssok 
v 1 g t pkaks fvqr fre aes ftols e e i qma wwcrs k k bxaqa i 

F LGN KLbK SNRLKHLE AQGT SLP KKIjEC 3 KTS I CNHbbRGSGIX 

tskhhlrqrrsqnkflrrorkporklqstllreiqofsqgtrks 
atdts akw rlshct vhrtdl y pn s kqllksgvsmpvj qtkekmi 
henbrgihenetdswtvmqinknstsgtlkftddiddifabmgv 


6250 


23i 


1306 


LAAbH I MAb PFR KDLE KY KDbDEDELiLiGNbSETE LKQLE T VLDD 
bD P EN Abb PAG FRQKNQTS KS TTG P FDREHLLS YbE KEAbEHKD 
REDYVPYTGEKKGKIFIPKOKPVOTFTEEKVSbDPEbEEAbTSA 
SDTEbCDbAAILGMHNbl TNTXFCNI MGSSNG VDOEHFSNWKG 
E K 1 bP VFDE P PN PTNVEES bK RT KENDAHb V E VNLNN J KN I P I P 
'1' bKD F AKALETNTHV KC FS bAATRSNDPVAT AF AEMbKVNKTbK 
SbNVESNFl TGVGILAbI DAbRDNETLAEbKI DNQRQQbGTAVE 
bEMAKMbEENTNIbKFG YQFTQQGPRTRAANA I TKNNDbVRKRR 
VEGDHC 


6251


62 


972 


TPG SGPMS AWAAAS bSRAAARCbbARG PGVRAAPPRDPR PSHPE 
PRGCGAAPGRTLHFTAAVPAGHKKWSKVRH3 KGPKDVERSR1 FS 
KLCLNIRIoAVKEGGPNPEHNSNbANIbEVCRSKHMPKSTIETAL 
KMEKSKDT YLLYEGRGPGGS Sbb IEAbSNS SHKCOAD IRH I1»NK 
NGGVMAVGARHS FDKKGVI WEVEDREKKAVNbERAbEMAI EAG 
AEDVKETEDEEERNVPKFICDASSbKQVRKKbPSbGLCSVSCAb 
EFI PNSKVQLAEPDbEQAAHblOAbSNHEDVIHVYDNIE 


6252


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKbOTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNREbRPGRPKNAYILXKSRISKKPQV 

PKKPREWKNPESQRGbSGAQDPFPGPAPVPVEWOKFCRIDKSR 
KLPHS KAKTRSRbEVAEAEEEETS 1 KAARSEbbLAEEPGFbEGE 
DGEDTAXICOADIVEAVDIASAAKHFDbNbRQFGPYRLNYSRTG 
RHbAFGGRRGH VAAbDWVTKKLMCE I NVMEAVRD I RFbHSEAbL 

a \7 Zl ON P UT .14 T VTiUfVZ T PT .UP T P P f*HP VTJ? 1 »FF T . P FH PT A , AT JV <?P 

TG FbT Y bDVS VG K I VAAbNARAGRbD VMS ONP YNAV I H1X5HSNG 
TVSbWSPAMKEPbAKIbCHRGGVRAVAVDSTGTYMATSGLDHQb 
KI FDbRGTYQPbSTRTLPHGAGHLAFSQRGbbVAGMGDWNI WA 
GOGKASPPSLEQPYLTHRbSGPVHGbQFCPFEDVbGVGHTGGlT 
S MbVPG AG EPNFDGbBSNP Y RS RKQRQE WE V KAb bEKVP AEL I C 
bDPRAbAEVDVI S bEQGKKEQ I ERLG YDPOAKAP FQPKPKQKGR 
SSTASbVKRJCRKVMDEEHRDKVROSbQQQHHKEAKAKPTGARPS 
AbDRFVR


6253


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKbQTKRKKPRRYWEE 
ETVPTTAG AS PG P PRNKKNR E LR PQR P KNA Y I bKKSR I S K KP QV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWOKFCRIDKSR 
Kb PHS KAKTRSR bE VAEAEE E ETS I KAARS EbbLAEE PG FbEGE 
DGEDTAKlCQADIVEAVDIASAAKHFDbNbROFGPYRbNYSRTG 
RHIAFGGRRGHVAAbDWVTKKbMCE INVMEAVRDI R FbHSEAbb 



AVAQNRW LH I YDNQG I ELHCI R FX DRVTRLEFLP FH FLLATASE 
TGFLTYLDVSVGKI VAALNARAC-K LDVMSQNPYNAVIHLGHSNG 
TVSLWS P AM K E P LAKI LCHRGG V HLA VAVDSTG T YMATSGLDHQL 
KIFDLRGTYOPLSTRTLPHGAGKLAFSORGLLVAGMGDWNIWA 
GOGXASPPSLEQPYLTHRLSGPVHGLQFCPFE0VLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIEKLGYDP0AKAPFQPKPKOKGR 
S S TASLV KR KR KVMDEEHRDK V K C S LQQQHH K E AKAK PTGAR PS 
ALDRFVR 


6254 


15t 


1139 


HALGRRGGSOSLSAAACGCFALKLRAPGSGRPALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLKEYR^CMPLTVDEYKIGQLYMISKH 
SHEQSDRGEGVEWQNEPFEDF M 1 IGNGQFTEKR VYLNS KLPSWA 
RAWPKI FYVTEKAWNYYPYTJ 7EYTCS FLPKFS I HI ET KYEDN 
KGSNDTI FDNEAKDVEREVCFI E 1 ACDEI PERYYKESEDPKHFK 
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV 
KKWRDI LLI GHRQA FAWVDEW Y D MTMDDVRE Y E KNMKEQTN I K 
VCKQHSS PVDD I ESHAQTS T 


6255 


3 


1444 


PTRPQQELLVSLATVI FVASQK7-.LS VESKAVI KQQLESVSNGWT 
VYR I ARQASRMGNHDMAKELYOS LLTQVASKH FYFWLNS LKEFS 
HAEOCLTGLQ E ENYS S ALSCI AE5 LKFYHKGI AS LTAASTPLNP 
LSFQCEFVKLR I DLLQAFSQL3 CT CNSLKTSP P PAI ATT I AMTL 
GNDLORCGRISNOMKQSMEEFRSL.t.SRYGDLYOASFDADSATLR 
NVELQOOSCLLISHAIEALILDPESASFQEYGSTGTAHADSEYE 
RRMMSVYNHVIjEEVESLNGKYTPVSYMHTACLCNAIIALLKVPL 
SFQRYFFQKLQSTS I KLALSPS PK.K PAEPIAVONNQQLALKVEG 
WQHGSKPGLFRKIQSVCLNVSS7L0SKSGODYKIPIDNMTNEM 
EQRVEPHNDYFSTQFLLNFAILG7 i-NITVESSVKDANGlVWKTG 
PRTTIFVKSLEDPYSQQIWjCX20CAOQPW3WORNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLE5 3 STSVPPAPGTMATDSWAJUA 
VDEQEAAAESLSNLHLKEEKIKFDTNGAWKTNANAEKTDEEEK 
EDRAAQSLLNKLIRSNLVDNTNOVrVLQRDPNSPLYSVKSFEEL 
RLKPQLI^G"VYAMGF^RPSKIOEWvLPLMlAEPP0NLIA0SOSG 
TGKTAAFVI^AMLSOVEPANKYPOCLCLSPTYELALQTGKVIEQM 
GKFYPELKLAYAVRGNKLERGQK3 SEQIVIGTPGTVLDWCSKIiX 
FIDPKKIKVFVLDEADVMIATQGKQDOSIRIQRMLPRNCOMLLF 
SATFEDSVWKFAQKWPDPNVI>iKREEETLDTlKOYYVIiCSSR 
DEKFOALCNLYGAI TI AQAM I FCHTR KTASWLAAELSKEGHQVA 
LLSG EMM V EQRAAV I ERFREGKE K VLVTTNVCARG I DVEQVS VV 
INFDLPVDKDGNPDNETYLHRIGK TGR FGKRGLAVNMVDSKHSM 

nilnriqehfnkkierldtddlde: EKIAN 


6257 


210 


615 


AFIPAMAELIOKKLOGEVEKYQOLCPXlLSKSMSGROKLEAOLTE 
mi VKEELALLDGSNWFKLLG PVLVKQELGEARATVGKRLD Y I 
TAEIKRYESOLRDLERQSEQQRETLAOLQQEFQRAQAAKAGAPG 
KA 


6258


210 


615 


AFIPAMAELIOKKLQGEVEKYQOLOKDLSKSMSGR0KLEAQLTE 
NN I VKE E LALLDGS NWFKLLG P V L VKQELG E ARATVGKRLD Y I 
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFORAOAAKAGAPG 
KA 


6259 


2 


1540 


ILEKGFPSQCHPERKWKVDDVLESSOENEDDHFWELLFHNNKTV 
SVENGDRGSKTFNLGTDPVSLRNYFYK1CDSCEMNLKNISGLII 
SKKNCSRKKPDEFNVCEKLLLDIR>uiKIPIGEKSYKYDQKRNAI 
NYHODLSQPSFGOSFEYSKNGOGFHEEAAFFTNKRSQIGETVCK 
YNECGRTFI ESLKLNISQRPHLEMEPYGCS ICGKS FCMNLRFGH 
QRALTKDN P YE YNE YGE I FCDN S A F 1 1 1 IQGA Y TRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYIvENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHOKTHTGEKPYECTECGKAFCQ 
Y E CNACG KS FCHRS ALT VHQRTHTGE KP F j CNECGKS F CVXSNL 
IVHQRTHTGEKPyKCNECGKTFCEKSALTKHORTHTGEKPYECN 
ACGKTFSQRS VLTKHQR I HTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALRELKVCLLGDTGVGKSSIVW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAIIVYDITKEETFSTLKNWVKELRQHGPPNI 
WA I AGNKCDLIDVREVMERDAKD YADS I KA I FVETSAKNAI NI 
NELFIEISRRIPSTDANLPSGGKGFKLRROPSEPKRSCC 


6261 




1188 


FW y RLG PGTR S R WPRRG S W AAS L V P RG P S P AALVTS PC P P DPLR 
SPACEPCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG 
KAEVS IDQS KLPGVKEVCRDFAVLEDHTL/'JiSLQEQEI EHHLAS 
NVORNRLVQHDLQVAKOLQEEDLKAQAQLQKRYKDLEQQDCEIA 
QE 1 0EKLA I EAERRR1 QEK KDED I ARLLCE KELQEEKKRKXHFP 
E F P ATRA Y ADS Y YYEDGGMKPR VM KEAVS T FS RMAHRDQ E WYDA 
EI ARKLQEEELLATQVDMRAAQVAQDEE I/iRLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKESDEPHHSKNERPA 
RPPPPI MTDGEDADYTH FTNQQS S TRHFS KS ES SHKGFHYKH 


6262 




1759 


FECHSOGLCSVHRPGKVPOARMSGLVLGORDEPAGHRLSOEKXJb 
GS TR L VSOG L E ALRS EHQA V LQS 1 • S QT I ECLQQGG H EEGLVHEK 
ARQLRRSMENIELGLSEAOVMLALASHLSTVESEKQKLRAQVRR 
LCQENQWLRDELAGTQQR LQRSEQAVAQLE E EKKHLEFLGQLRQ 
YDEDGHTSBEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGArA 
AQOGG Y E I PAR LRTLHNLV I QYAAQGR YEVAVPLCKQALEDLER 
TS GRGH PD VATMLNI LALVY RDQN KYK EAAKLLNDALS I RES TL 
GPDHPAVAATLNNIiAVLYGKRGKYKEAEPLCORALEIREKVLGT 
NFPDVAKOlxNNLAIiLCCNOGKYEAVERYYCRALAIYEGQLGPDN 
FNVARTKNNLASCYLKOGKYAEAETbYKEI LTRAHVQEFGSVDD 
DH K P I WMHAEE REEMS K S R H HEGGTP YAE Y GG W YKACKVS S PT V 
NTTLRNLGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


RELDS LADLPERI KPP YANGLSTSHLRSSSVEDVKLI I SEGRPT 
IEVRRCSMPSVICEHTKOFOTISEESNOGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSN3 HS SFATS PTGASN SKYV S ADRNL I KNT 
APVWTVMDSPVHLEPSSQVGV1QNKSWEMPVDRLETLSTRDFIC 
PNSNIPDOESSLQSFCNSENKVLKENADFLSLRQTELPGNSCAQ 
DPAS FMPPQQPCSFPSQSLSDAES 1 S KHMSLS YVANQEPG I LQQ 
KNAVQ 1 1 SSALDTDNESTKDTENTFVLGDVOKTDAFVPVYS DST 
1 0EAS PNFEKAYTLPVLFSEKDFNGSDASTOLNTHYAFSKLTYK 
SSSGHEVENSTTDTQVISKEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPES EKCLLSIEDEESOQS I LSSLENHSQ 

NOSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVSPSLLQAKEKTQOSLAAIVDSLKLDEIQPYSSER 
ANPYFEYLHIRKKIEEKRKLLCSVIPQAPOYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFROOEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLAWQTLPFSACTVLLDABVYNV 
P LDSQ SDDS KTS VRDRFNARO FMS WLQD VD DKFD KLKTCLLMRQ 
OHEAAALNAVQRLEWQLKLOELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQE2JNALDMAPEIKNTGPMCLIENTNGELVANPEALKILSAI 
TQPWWAIVGLYRTGKSYLi'lNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSW1FTLAVLL 
SSTLVYNSMGTINQQAMDOLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
ELDPEFVQQVADFCS Y I FSNS KTKTLSGG I KVNGPRLESLVLTY 
I NA I SRGDLPCMENAVLALAQ 1 ENSAAVQKA I AHYDQQMGQKVQ 
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAOL 

dkkrddfckqnqeassdrcsallqvifspleeevkagiyskpgg 
yclfiqklqdlekkyyeeprkgiqaeeilqtylkskesvtdail 
otdq i lte kekei evec vkaesaqas akmveemqi kyqqmmeek 
eksycehvkqltekmereraqlleecextltsklqeqarvlker 
cqgestqlqneiqklqktlkkktkrymshklki 


6265 


143 


1960 


khrqennaldmapeihmtgpmclientngelvanpealkilsai 
topwwaivglyrtgksyiimnkliagknkgfslgstvkshtkgi 
wm w cvph p kkpehtlvlldteglgdvkkgdnqndsw i ftlavll 
sstlvynsmgtinqqamdqlyyvtelthrirsksspdenekeds 
ad f vs ffpd fvwtlrd fs ldle adgq pl t pde yle y s lkltqgt 

SQKDKNrNLPRLCIRKFFPKKKCFvFDLPIHRKK^ 
ELDPEFVQQVADFCS YI FSNS KTKTLSGG I KVNGPRLESLVLTY 
3NAISRGBLPCKENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ 
LPAETLQELLDLHRVS EREATEVYMKNS FKDVDHLFOKKLAAQL 
DKKRDDFCKQNOEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
0TD0 1 LTE KEKEI EVE CVKAESAOASAKMVEEMQI KYQQMMEEK 
EK S YQEHVKQL»TE KM ER SRAQLLEEQEKTLTS KLQE QAR VLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQG IGLDGI PEVTASE 
G F T VNE I NKKS I H I S CP KE NAS S KFLAP YTT FS R IH TKS I TCLD 
ISSRGGLGVSSSTDGTMKIWQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKIWSAEDASCWTFKGHKGGILDTAIVDR 
GRK WSASRDGTARLWDCGRS ACLGVLADCGS S INGVAVGAADN 
bl JvJjV3l>Pfc.QMPobRbVbl fcAKMLLlaAREDKKLQCi/jLQSRQjjVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGA PVLSLLS VPJDG FI ASQGDGS CFI VQQDLD YVTELTGADCD 
PVYKVATWE KQI YTCCRDGLVRRYQLSDL 


6267 


3 


622 


LGMMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGKGIFREIWK 
NRYWLKGDQLYISEKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RS KKNHS KFTLAKSKQPGNTAPNLI FLAVSPEEKES W INALNSA 
ITRAKKRILDEVTVEEDSYLAHPTRDRAK3QHSRRPPTRGHLMA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1368 


HRELCQNLPAGLS SAL I DNPLTLLLS I DTYVMLQEPVTFQDVAV 
npccppwfnr/r c dtod t r v dtiumt PTrr:uT .uc vrmpttt vyjwx 

Lit o Kc, IL ITVVjijJLiOlr JL l^K X £> I KU vi v lj_it, i r l^rlljVo X XjEtli l\CtLje\ 

PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV 
LKOMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANS GALDTNQVLLH KIP P RKRLR KRDS QVKS MKHNSR VKI HQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFRKPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
GERPFECQECGRTFNDRSAISQKLRTHTGAKPYKCQDCGKAFRQ 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR 
KKKQPTS 


6269 


2686 


1445 


HASAPTRJ^MAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL 
TQYVKWTNDKS1/5GIEGCLSKLW^ADPTFVMGHAMATGLVLIGT 
GSSVKLDKELDLAVKTMVEISRTQPLTRREQLHVSAVETFANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
I YP FWTPDIPLSS YVKG I YS FGLMETN FYDQAE KLAKEAI/S INP 
TDAVJSVHWAHIHEMKAEIKDGLEFMQHSETLWKDSDMLACHNY 
WKWALYLIEKGEYEAALTIYDTHILPSLQANTDAMLDVVDSCSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQELLTTLRDASESPGENCQHLLARDVGIiPLCQALVEAE 
DGNPDRVLELLLPI RYRI VOLGGSNAORDVFNQLLI HAALNCTS 
S VHKNVARSLLMERDALKFNS PLTERLI RKAATVHLMQ 


6270 


23 


2066 


S VTVTLG S EGDGR P P T Y HLE EMEQEP QNG E PAEI K 1 1 REA Y KKA 
FLFVNKGLNTDELGOKEEAKNYYKQGIGHIiLRGISISSKESEHT 
GPGWESARQMQOKMKETLCKVRTRLEILEKGLATSliONDLQEVP 
KLYPEFPPKDMCEKLPEPOSFSSAPQKAEVNGNTSTPSAGAVAA 
P ASLS LP SQS CP AE AP ? AYTPQAAEGHY7VS YGTDSGEFSSVGE 
EFYRNKSQPPPLETLGLDADELILIPKGVQ1FFVNPAGEVSAPS 
YPGYLRIVRFLDNSLDTVLKRPPGFLQVCDWLYPLVPDRSPVLK 
CTAGAYMFPDTMLQAAGCFVGWLSSEX.PEDDRELFEDLLRQMS 
DLRL0ANWMRAEEENEF01FGRTRPSSDQLKEASGTDVKQLDQG 
NKDVRHKGKRGKRAXDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
S EXVAHNI LS GAS WVS WG LVKGAE 1 TGKAI QKGASKLRERI QPE 
EKPVEVSPAVTKGLYIAKOATGGAAKVSQFLVDGVCTVANCVGK 
ELAPHVKKHGSKLVPESLKKDKDGKSPLDGAMVVAASSVQGFST 
VWQGLECAAKCIVNNVSAETVOTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYNI NNIGI KAMVKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


105f 


GCGVKTAGMVGREKELSIHFVPGSCRLVEEEVNIPNRRVLVTGA 
TGLLGRAVHKEFQONNWHAVGCGFRRARPKFEQVNLLDSNAVHH 
1 1 IIDFQPHVI VHCAAERRPD WENQPDAASQLNVDASGNLAKEA 
AAVGAFLI Y I S SDYVFDGTNP PYREED1 PAPLNLYGKTKLDGb'K 
AVLENNLGAAVLRI P 1 LYG EVEKLEES AVTVMFDKVQFSNKS AN 
MDHWOQRFPTHVKDVATVCRQLAEKRMLDPSIKGrFHWSGNEQM 
TKYEMACAIADAFNLPSSHLRP1TDSPVLGAQRPRNAQLDCSKL 
ETLG I GQRTP FR I G I KE S LW P FL I DKRWRQT V FH 


6272 


1136 


526


GAVMEDAAAPGRTBGVLERQGAPFAAGQGGALVELTPTPGGLAL 
VSPYHTHRAGDPLDLVA1AEQVQKADSFIRANATNKLTVIAEOI 
QHLQEQ ARKVLEDAHRE'ANLHHVACN I VKKPGT5 I Y YLY KR ESGQ 
QYFSIISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAK3 
SMMDTLLSQSVALPPCTEPN FQGLTH 


6273 


256 


843 


SCPR VS PECRSLGCQVMFSLPLNCS PDHI RRGS CWGRPQDLKI A 
SAAWNSKCHPGAGAAMAROHARTLWYDRPRYVFMEFCVEDSTDV 
HVLI EDHRI VFS CKNADGVELYNE I E FYAKVNS KDSQDKRSSRS 
ITC F VR KWKE K VAW FRLTK ED I KF VWLS VDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAAR S LS R FRGCLAG ALLGDCVGS F YEAHDT 
VDLTSVLRHVQSLEPDPGTPGSERTEALYYTDDTAMARALVQSL 
IAKEAroEVDMAHRFAQEYKKDPDRGYGAGVVTVFKKLLNPKCR 
DVFEPARAQFNGKGSYGNGG/^MRVAGISLAYSSVQDVQKFARLS 
AQLTHAS S LG YNG A I LC ALAVHIxALOGES SSKHFLXQLLGHMED 

ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 
SLGGDTDTIATMAGAIAGAYYGMDQVPESWQQSCEGYEETDILA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVFRPAICTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRSHFPJDKYRLPKNETDESQ I QMAGGDVELPRELAKM I 
EEDTEEEEEKASVLGOLASLPGLNLGSLKDKAOATLGDLKOSAE 
KCHVM 


6276 


191 


31 


TLLPLP P L PDTEGM I LLNTGLEGTVAENP VP I VHTPS GN I LTLE 
SCLQQLATHPGHWG I HLQIAEPAALRPSLALLARLSSLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
LGSGYREQLLTDMLELCOGLWOPVSFQNQAMLLGHSTAGAIGRL 
LASSPRATVTVEHNPAGGDyASVRTALLAARAVDRTRVYYRLPO 
GYHKDLZjAHVGRN 


6277 


4600 


2744 


MAFRTEMGLyYSYFKTIVEAPSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPEVIliASWYRIYTKIMDLIGIQTKICWTVTlGEG!.»SF 
TESCEGLGDPACFYVAV1 FI LNGLMMALFFI YGTYLSGSRLGGL 
VTVLCFFFNHGECTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
TKLYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYWGY 
ID I CKLRKI 2 Y I HM 1 S LALC FVLKFGNSMLLTS Y YASS LV 1 1 WG 
ILAMKPHFLKINVSELSLWV10GCFWLFGTVILKYLTSKIFGIA 
NDAHIGNLLTSKFFSYKDFDTLLYTCAAEFDFMEK3TFLRYTKT 
LLLPWLVGFVAIVRKI 2 SDMWGVLAKQQTHVRKHQFDHGELVY 
HALQLLAYTALG I L I MRLKLFLrPKMCVMASLJ CSRQLFG WL FC 
KVHPGA I VFAI LAAMS I OGS ANtOTQWNI VGEFSNLPQEEL I EW 
IKYSTKPDAVFAGAMPTMASVKLSALRPIVNHPHYEDAGLRART 
KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 
PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVLEW 
KE 


6278 


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVEX3QQYAWGTVLLL 
IRII LEYCOGVDN I ? 5VTTDMLTRLS DLLKYFKSRS CQLVLGAG 
ALQVVGLKTITTKNLALSSRCLOLIVHYIPVIRAHFEARLPPKQ 
Y 5MLRK FDH I T K D Y H DH I AE I S AKL VA I MDSLFDKLLS KYE VKA 
P VPS AC FRN I CKQMT KMHE A I FDLLP E EQTQMLFLR I NAS Y KLH 
LKKQLSHLNVI NDGG PONGLVTADVAFYTGNLQAiKGLKDLDLN 
MAEIWEQXR 


6279 


127 


1687 


GGAMASDGARKOFWKRSNSKLPGSIQHVYGAOHPPFDPLIjHGTL. 

lrstakmpttpvkakrvstfoefesntsdawdagedddellama 
aeslksewmetanrvlrnhsorqgrptlqegpglqoxprpeae 
ppsppsgdlrlvksvseshtscpaesasdaaplorsoslpksat 
vtlggtsdpstlsssalsereasrldkfkqllagpntdleelrr 
lswsgipkpvrpmtwkllsgylpanvdrrpatlqrkqkeyfafi 
ehyydsrndevhodtyrqihidi prmspealilqpkvtei fer: 
lfiwairhpasgyvqgindlvtpffwficeyieaeevdtvdvs 
gvpaevlcnieadtywcmsklldgiqdnytfaqpgiomkvkmle 
elvsr I deqvhrkldqhevr y lqfafrwmnnllmrevplrct I R 
lwdtyosepdgfshfhlyvcaaflvrwrkeileekdfqelllfl 
qnlptahwdded1slllaeayrlkfafadapnhykk 


6280 


857 


2515 


eccdqkmgsrnsssagsgsgdpseglprrgaglrrseeeeeede 
dvdlacvlayllrrgqvrlvqgggaanlqfiqalldseeendra 
wdgrlgdrynppvdatpdtrelefneiktqvelatgqlglrraa 
qkhsfprmlhqrerglchrgsfslgeqsrvishflpndlgftds 
ysqkafcg i yskdgqi fmsacqdqti rlydcrygrfrkfxs i ka 
rdvgwsvldvaftpdgnhflysswsdyihicniygegdthtald 
lrpderrfavfs i avs sdgr evlggandgclyvfdreqnrrtlq 
i esheddvnavafad i s sqi lfsggddajckvwdrrtmreddpk 
pvgalaghqdgitfidskgdarylisnskdqtixlwdirrfssr 
egm easrqaatqonwdyrwqqvp kkawrkl klpgds s lmtyrgh 
gvlhtlircrfspihstgqqf1ysgcstgkwvydllsghivkk 
ltjihkacvrdvswhpfeekivssswdgnlrlwqyrqaeyfqddm 
peseecasapapvpqs stp fs spq 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGOVRLVOGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRSLEFNE I KTQVELATGQLGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCG I YSKDGQI FMSACQDQT I RLYDCRYGRFRKFKS I KA 
RDVG W S VLDVAFTPDGNH FLY SSWSDYIHICKI YGEGDTHTALD 
LRPDERRFAVFS I AVSSDGR E VLGGANDGCLYVFDR EQNRRTLQ 
I ES HE DDVNAV A FAD I SS Q I L FSGGDDAI CKVWDRR TMR EDD P K 
PVGALAGHQDG I TFI DSKGDAR YLI SNS KDQTI KLWDI RRFSSR 
EGMEASRCAATO0KWDYRW0QVPKKAWRKLKLPGDSSLMTYRGH 

f~\jx iifT TurDf CDTucTrnoui vcrr"CTrinnn7vnt t cr*jT \ 7 v t/ 
vj vjjnl Jjlr\*-ivr oJrj.no l\j\j\Jr 1 lobtbi Vjjvv v v iJjJuJjot^rii VXK 

LTNHKACVRDVSWHPFEEKIVSSSWDGNXRLWQYRQAEYFODDK 

PESEECASAPAPVPQSSTPFSSPQ 


6282 


12S 


906 


RMAACRALKAVLVDLSGTLHI EDAAVPGAQEALKRLRGAS VI IR 
FVTNTTKES KQDLLERLR KLE FDI SEDE 1 FTSLiTAARS LLERKQ 
VRPMLLVDDRALPDFKGIQTSDPKAWMGLAPEHFHYO I LNQAF 
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
G I LVKTGKY RAS DEEK I NP PPYIiTCES F PHAVDHI LQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSNSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPS I PKNALPI TKPTS PAPAAQSTNGTHAS YGPFYLEYSLL 
AEFTLWKOKLPGVYVQPSYRSALMWFGVIFIRHGLYODGVFKF 
TVYI PDNYPDGDCPRLVFDI PVFHPLVDPTSGELDVKRAFAKWR 
RNHNHI WQVLMYARJRVFYK I DTAS PLN P EAAVLYEKD1 QLFKSK 
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 


RSVIPGST1SSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQ VERENVOKRTFTRWI NLH bEKCNPPLEVKDLF VD1 QDGKI 
LrALLEVLSGRNLLHEYKSSSHRIFRKWIAKAiKFT.EDSiWKL 
VSIDAAEIADGNPSLVLGLIWNIILFFOIKELTGNLSRNSP.SSS 
LAPGSGGTDSDSSFPPTPTAERSVAISVKDQRKAIKAL1AW0R 
KTRKYGVAVODFAGSWRSGLAFLAVIKAIDPSLVDMKCALENST 
RENLEKAFSIAODALHIPRLLEPEDIMVDTPDEQSIMTYVAQFL 
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPS KVF VCDKPESM KEFRLDGVSSHALS DS 
STEFMHQI I DQVLOGGPGKTSD I SEPS PESS1 LS SRKENGRSNS 
LPIKKTVHFEADTYKDPFCSKNJ^SLCFEGSPRVAKESLRODGHV 
LAVEVAEEKEOKOESSKIPESSSDKVAGDIFLVEGTNNKSOSSS 
CNGALESTARHDEESHSLSPPGEWTVMADSFQIKVNLMTVEALE 
EGD Y FE AI PLKAS KFNSDL I DFASTS QAFNKVPS PHETKP D E DA 
EAFENHAEKLGIO^SIKSAMKj^SPEPQVKMDKHEPHODSGEEA 
EGCFSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLOGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDIiKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYJjSRGGVGTTPASEPAP 
i^HEDHWRETKENDPMDSHOSOESPNLENIANPLEENVTKES 
ISSKKKEKRKHVDjWESSLFVAPGSVQSSDD I PSR 
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


scktenllsmwwfqqglsflpsalviwtsaafifsyitavtlhii 
idpalpyisdtgtvapekclfgamlniaavlciatiyvrykqvh 
als peenvi i klnkaglvlgilsclgls i vanfqkttlfaahvs 
gavltfgmgslymfvqtilsyqmqpkihgkqvfwirlllviwcg 
vsalsmltcssvlhsgnfgtdleqklhwnpedkgyvlhm1ttaa 
EWSMSFS FFGFFLT YI RDFQKI slrveanlhgltlydtapcp IN 
NERTRLLSRDI 


6286 


1619 


276 


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
P FS I P AAS E I ADLSN 1 1 NKLLKDKNE FH KHVEFD FL I KGQFLRM 
PLDKHMEMEN I SS EE WEI E Y VEKYTAPQPEQCMFHDDW I SS I K 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD 
SLSCLLLSASMDQTILLWEWKVERNKVKALHCCRGHAGSVDSIA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGLTRTPI VTLSGHMEAVSS VLWSDAEE I CSASWDHTI RVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLVJDPRT 
KDGSLVSLSLTSHTGVrVTSVKWSPTHEQQLISGSLDNIVKbWDT 
RSCKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
TTSHVGA 


6287 


27-8 


1482 


MQFFFNFOIGLRSTSGKEKYSGDAGFLGDAL-CLFLOCLALDEDF 
APAKLQVOKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS 
WiEES0SLN2PSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEODVIVNEDGR 
NKuKKOGETP.NTEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT 
OLbEELIVKYLPDELSERKKlYDEETAELSKLTKNVpIFVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
GCMLQIRNVHFLPDGRSVVDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6288 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLAJs'LGAECLR 

MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGODPFFDM 

H MM VS K P EQW V KPMA VAG ANQ Y TFHLE ATEN PGALI KD I RENGM • 

KVGbAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGOKFMEDMM 

PKVHWLRTQFPSLD1EVDGGVGPDTVHKCAEAGANM3VSGSAIM 

RSEDPRSV1NLLRKVCSSAAQKRSLDR 


6289


3 


743 


VTLY PCRGLVGNLLLGASGMASGCKI GPS I LN SDLAKLGAEC LR 
MLDSGADYLHLDVMDGHFVPNITFGHFWESLRKOLGODPFFDM 
HMMVS KP EQWV KPMA VAG ANQ YTFHLEATEN PGALI KD 1 RRNGM 
KVGLAIKPGTSVEYLAPWANOIDMALVMTVKPtiFGGOKFMEDMM 
PKVHWLRTOFPSLD2EVDGGVGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSV1NLL.RNVCSEAAQKRSLDR 


6290 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRM3ERYTRKA 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSOSKS 
DJTRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 
GNATEKVQTMFTAIDELLYEQKLSVHTKSLOEECQQWTASFPHL 
RI LGRQI ITPSEGYRLYPRSPSAVSASYETTLSQERDSTI FGIR 
GKKLHFS SS YAHKASS IAKSSSFCSMERDEEDS I IVSEG IIEEY 
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCMKEDVLAY 
V FDS VWCKWS CMEQLTRSH WEGFASDDESNVAVTR PDS ESS CV 
LSELHPLVLPRVPQSKVLYITSNPMSbCQASRKQPNVKDLLVHG 
WPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPFRTLHP1STSKSCAETPRSVEEILRGA 
RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDSSRAQSAWDEPNYOQPQERLLLPDFFPRPNTTOSFLLDT 
OYRRS CAVEY PHQARPGRGSAGPQLHGSTKSQSGGR PVS RTROG 
F 


6291 


1732 


602 


LV AKMAS S AS ARTPAG KR V I NQEELRRLMKE KQRLiS TSK KR 1 ES 
P FAKYNRLGQLS CALCNTPVKSELLWQTHVLG KQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDODVKRAKATLVPOVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLFNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 

3 GEIDEQI SCYRRVEKLRNRQDEI KNKLKEI LTJ XELQKKEEEN 
ADS DDEGELQDLLSQDWRVKGALL


6292


1835 


1142 


tcpgamkmvapwtrfysnscclcchvrtgtillcvwyli inavv 
LLILLSALADPDQYNFSSSEJjGGDFEFMDDANI'jCIAIAISLLMI 
L3 CAMAT Y G AY KQRAAW I IPFFCYQI FD FALNMLVAI TVLIYPN 
S I QEY IRQLPPNFP YRDDVMS VNPTCLVbl I LI.Fl S 1 1 LTFKGY 
L I S CVWNCYRY INGRNSSDVLVYVTSNDTTVLLP P YDDATVNGA 
AKEPPPPYVSA 


6293 


2382 


1035 


FWCTLGTVDVHPIGV7CAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEWDKSOVSRTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 

SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MS ERR S DMAKH PT FR KI Y C D AV P Y b F KK VRAVYT EGGW FEEGM K 
LEAIDPbNLGNlCVATVCKVLLDGYLMICVDGGPSTDGLDWFCY 
HASSHAI F PAT FCO KND 1 E LT P P KG Y E AQT FNK EN YLEKTKS KA 
APSRLFNMDCPNHG FKVGMKLEAVDLMEPRLI CVATVKRWHR L 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKHIPPTKTRFLROGSKKPLLEDD 
PQGAKKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NI KQETDD 


6294 


354 


1814 


AOI^TTRGRTVAGGVKWIPSPFPDLELYSCCLGTDRGFPELSHHC 
KNV I ATAS DYDMAE I TNI R PS FDVS P WAGLI GASVLWCVS VT 
VFVWSCCHQOAEKKHKNPF YKF1 HMLKG I SI YPETLSNKKKI I K 
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDO 
LPIKMDYGEELR5PITSLTPGESKTTSPSSPEEDVMLGSLTFSV 
DYNFPKKAbWTIOEAHGbPVMDDQTOGSDPYIKMTILPDKRHR 
VKTRVbRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDI IKRNIQKCISRGELQV 
SLS YQPVAQRMT WVLKAR HLQKMDI AGLSGNP YVKVNVYYGRK 
R 3 A K K KTHVK KCTLN P I FNES F I YD I PTDLL PD I SI EFL V I DFD 
RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAKraSLS 
EY 


6295 


2795 


617 


VS S ALLTGATSGSDAAKS EGAS AS PLS CTNAVAMDRPDEGPPAK 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPOORPRLQEETEAA 
QVLADMRG VG LG F ALPP P P PY V I LEEGG3 RAY FTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 
GALETCS AVGWAPQRLVDP KS KEE A 1 1 1 VEDEDEDERESMRSSR 
RRRRRRRRKORKVKRESRERNAERMESILQALEDIQLDLEAVN1 
KAGKAFLRLKRKFIOMRRPFLERRDLiI IQHI PGFWVKAFLNHPR 
I S I L I NRRD ED I FR Y LTNL QVQDLRH I SMGYKM KL YFQTN PYFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGOEPOARRHGNODASHS 
FFS WFSNHSLPEADR I AE 1 1 KNDbWVNPLRYY LR ERGSRI KRKK 
QEMKKRKTRGRCEVVIMEDAPDYYAVEDIFSEISDIDETIHDIK 
I SDFMETTDYFETTDNEI TDINEN1 CDSENPDHNEVPNNETTDN 
N^SADDHETTDNNESADDK^ENFEDIWKK^DNEENPNNNEWTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRD1 EYYEKVI EDFDKDQADYEDVI EI I 
SDESVEEEGIEEG 1 QQDEDI YEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNG WAN PGKRG KTG 


6296 


727 


1199 


RKCGCD AQG ACDS LP FTGTS SPVTARN A I PEARCCVWLLDGTTV 
EA VRPAR ER LAR KELRQKRMQQFS RDS A YS S NKDS TCL LTBRDT 
LGTSLQ F PS PFSGT I S FG S FS DS G I F PLGS QCCLGFQQ FS I SG K 
KWALIHKRVRLSVFGARWGR I YFGK 


6297 


1 


922 


QRTW^PSSCGPRGAEYGAiMAMEGYWRFLALLGSAbbVGFLS 
VI FALVWVbHYREGLGWDGSALEFNWHPV^MVTGFVFIQGIAI I 
V YRbP WTW KCS KLLM KS I HAGLNA VAAI LAI I S WAVFENHNVN 
NIANMYSLHSWVGLIAVICybLObbSGFSVFLLPKAPLSLRAFL 
MPIHVYSG I VI FGTV I ATALMGLTEKL I FSLRDPAYSTFPPEGV 
FVNTLGLL I LVFGAL 1 FW 1 VTR PQWKRPKEPN ST3 LHPNGGTEQ 


6298 


3 


988 


SVPLRRI^LSGTbOGAGTTTKMAVARbAAVAAWVPCRSWGWAAV 
P FG PHRGLS VLLAR 1 P QRA PR WLPACRQKTSLS FLNR PDLPNLA 
YKKLKGKSPGI IF1 FGYLSYMNGTKALiAI EEFCKSLGHACIRFD 
YSGVGSSDGNSEESTLGKWRKDVLSI IDDLADGPQI LVGSSLGG 
WLMLHAAIARPEKVVALIGVATAADTLVTKFNQLPVELKKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLL14SPIPVNCPIRL 
LHGMKDD I VPWHTS M Q VADRVLS TD VDVI LR KKSDHRMREKADI 
OLLVYTIDDLIDKLSTIVN 


6299 


512 


814 


ECDbEGIMPNVTISLSLPTNGSPLODILVHPCVTSLDSAILTSS 
S 3 DAMDCSAFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AA P S C W SQRG VP AAGTP S S PRLLVS RAAAP S AG ? WG AW RQG ARA 
AOSPFSIPNSSSVPYGSODSVHSSPEDGGGGRDRPVGGSPGGPR 
LV1GSLPAKLSPHMFGGFKCPVCSKFVSSDEMDLHLVWCLTKPR 
ITyNEDVLSKDAGECAICLEELQOGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


264 


GKFVPVNWEPPQPLFFPKYLRCYRCLLETKELGCLLGSDICLTP 
AGS S C I TLH KKN S SGSDVMVSD CR S KSQMS DCS NTRTS P VSG FW 
1 FSQYCFLDFCNDPQNRGLYTP 


6302 


49C 




IFGFLHLFHMEHSFLIiVCALFAHVFFSSSCGSSVALHSDPCLLS 
PVLLNCLPGDLRPLDELY/iQKLKYKAISEELDHALNDMTSL 


6303 


2 


196] 


YWNEYGGGLLWOSWQEKHPGQALSSEPWNFPDTKEEWEQHYSOL 
YWYYLEQFQYWEAOGWTFDASOSCDTDTYTSKTEADDKNDEKCM 
KVDLVSFLSSPIMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT 
OSOLDSCTSKDGHQQLSEVSSKRECPASGOSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELDIDENPASDFDDSGSLLGFKYGSGQKYGGIPNFS 
HROVR YLEKKVKLKSKYLDMRROI KMKNKHI FFTKESEKPFFKK 
SKILSKVEKFLTOVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCOSVSSAGELETENYERDSLLATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKKKKVNGLPPEIAA 
VPEbAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAGR 
VSOS FKCDWVDAFCGVGGNT I QFALTGMRV1 AIDI DPVKlAIiA 
RNNAEVYGIADKIEF1CGDFLLLASFLKADWFLSPPWGGPDYA 
TAET FD IRTMMS PDG FEI FRLSKX2 TNNI VY FLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


3 


1436 


HRARXDRSRESPGGDLRHPGRVRRDlTLSGHPRLSrQHVVLLRE 
DEVGDPGTKDLGHPOHGS P I QET0SEWT1.VS PbPGSDMAALPA 
W RATS G LTL WP HTAEGRDLLG AE WRALTGGQOAED PTLASG AYQ 
WPGSVEKLQGSWJCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLOVGFSTEAALQDLSSPRLSQLCSQGL 
CGLI KR PGDLPE VLS FHVDRVLGLRRS LPAVARRFHS PLLPYRY 
TDGGARPV I WWAPDVQHLSDPDSDQNS LALGWLQYQALLAH S CN 
WPGOAPCPGIHKTEWARLALFDFLLQVHDRLDRYCCGFEPEPSD 
PCVEERLREKCRNP AELRLVH ILVRS SDPSHLVY1 DNAGNLQH P 
EDKLNFRLLEGIIX5F?ESAVK^3 J ASGCLQNML] J KSLQMDPVFWE 
SOGGAQGLKQVLQTLEQRGQVLLGHl QKHNLTLFRDEDP 


6305 




420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQQEEPPTES 
FDPAPGQEREEDQGAAETCVPDLEADLQELSOSKTGDECGDGPD 
VQGKILTKSEQFKMPEGR 


6306 


1 


1874 


PTRPSKVKVPHTFL IHS YT RPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEAL1NGDVPMEEATDFSEADKSALMD 
ESEDSGVI P6SHSENALHASEEEEGEGGKA0SSLGYI PLMRWQ 
S VRHTT RKS STTLREGWWHYSN KDTLR KRHY W RLDCKC I TLFQ 
NNTTNRYYKEIPLSEILTVESAONFSLVPPGTNPHCFEIVTANA 
TY F\/GEMPGGTPGGPSGQG AEAARGWETAI RQALMPVILQDAPS 
APGHAPHRQAS LS J SVSXSQl QENVDIATVYQI FPDEVLGSGQF 
GWYGGKHRKTGRDVAVKVIDKLRFPTKQESOLRNEVAILQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 
TKFLITQ3LVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFAR 1 1 GEKS FRRS WGTPAYLAPEVLLNQG YNRS LDMWS VG 
VIMYVSLSGTFPFNEDED1NDQIQNAAFMYPASPWSHISAGAID 
LIIWL.LQVKMRKRYSVDKSLSHPWLQEYQTV7LDLRELEGKMGER 
YITKESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
LAERISVL 


6307 


2136 


585 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKVVRQSKFRHVFG 
QPVKNDOCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL 
VIiPIjS KTGRIDKAYPTVCGHTGPVLDIDKCPHNDEVIASGSEDC 
TVMVWQI PENGLTS PLTEP WVLEGHTKRVGI I AWHPTARNVLL 
SAGCDNWIjI^WVGTAEELYRLDSLHPDLIYNVSWNKNGSLFCS 

ackdksvri idprrgtlvaerekahegarpmrai fladgkvftt 
gfsrmserolalwdpenleepmaloeldssngallpfydpdtsv 
vyvcgkgdss iryfe i tee ppy1 hflntftskepqrgmgsmpkr 
glevskceiarfyklherkcepivmtvprksdlfqddlypdtag 
peaaleaeewvsgrdadpilislreayvpskqrdlk1srrnvls 
dsrpamapgsshlgapastttaadatpsgslarageagkleevm 
qelralralvkeqgdr1 crleeqlgrmengda 


6308 


2 


1118 


GRPTRPEKMLLSbVLHTySMRYLLPSWLLGTAPTYVLAWGVWR 
LLSAFLPARFYQALDDRLYCVYQSMVLFFFENYTGVQILLYGDL 
PKNKENIIYLANH0STVDWIVADILA1R0NALGHVRYVLKEGLK 
WLPIiYGWYFAQHGGIYVKRSAKFNEKEMRNKbQSYVDAGTPMYL 
VIFPEGTRYNPEOTKVLSASOAFAAORGLAVLKHVLTPRIKATH 
VAFDCMjOJYLDAIYDVTWYEGKDDGGORRESPTMTEFLCKECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRFPGKSVNSFCLSIKKTLPSMLILSGLTAGMLMTDAGRKL 
YVNTW I YGTLLGCLWVT 1 KA 


6309 


220 


563 


LVAEVKEPCSLPMLSVDNENKENGSVGVKNSMENGRPPDPADWA 
VMDWNYFRTVGFEEOASAFOEQEIDGKSLLLMTRNDVLTGLQL 
KLGPAliKIYEYHVKPLCTKHLKNNSS 


6310 


36 


979 


GPRCWK FLI LSS VNCETLR 1GKAWPQSSGQERYWTPRTHSSASS 
AQRGS UVELNVAAAGLVJADCDQPLYDCPMCGLI CTWYH1LQEHV 
DLHLEENS FQQGMDRVQCSGDLOLAIIQLQQEEDRKRRS EESRQE 
IEEFQKLQRQYGLDNSGGYKQQQLRNMEIEVNRGRMPPSEFHRR 
KADMMESliALGFDDGKTKTSGIlEALHRYYQNAATDVRRVWLSS 
WDHFHSSLrGDKGWGCGYRKFOMLLSSLLQNDAYNDCr.KGMLIP 
CIPKIOSMIEDAWKEGFDPOGASDLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRLAAAARTGHGVGRRARLACLGEPRVKAAVKliTL 
ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
PCTCKVH F PDPNKLHCFOLTVT PDEG Y YQGG KPQFETE VPDA YN 
MVPPKVKCLTKIWHPNITETGEICIjSLLREHSIDGTGWAPTRTL 
KDWWGLNSLFTDLLNFDDPLNIEAAEHHLRDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCISIPPPLT 

rqisvkalgigknwcekaatsvpafrmvtasryypqlmslvgn 
vlrflpafvrmkqli sehyvgavmi cdari ysgsllspsygwic 
delmgggglhtmgtyivdllthltgrraekvhgllktfvronaa 
1rgi rhvts ddfcffqmlmggg vcs tvtlnfnmpgafvhe vm w 
gsagrlvargadlygqxnsatoeelllrdslavgaglpeqgpqd 
vpllylkgmvymvqalrqsfqgogdrrtwdrtpvsmaasfedgl 
ymqswdaikrssrsgeweavevlteepdtnqnbcealqrnnl 


6313 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL 
FHRRRAHGCTLSCSSF^EQPTAMEAEETMECLQEFPEKHKMILD 
RLNEQREODRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVE1 EGVSKMAFRHIiI EFTYTAKLMIQGEEEANDVWKAAE 
FLQMLEA2 KALE VRN KENS APLEENTTGKNEAKKR KI AETSNV I 
TESLPSAESEPVEIEVE1AEGTIEVEDEGIETLEEVASAKQSVK 
Y I QS TGS SDDSALALIiAD I TS KYRQGDRKGQI KEDGCPS D P TS K 
QVEGIEIVEIiQLSHVKDLFHCEKCNRSFKLPYHFKEHMKSHSTE 
SFKCEICNKRYLRESAWKOHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKOFDHFGHFKEHLRKHTGEKPFECPNCHERFARKSTLKCH 
LTACQTGVGAKKGRKKLYEC0VCNSVFNSWDQFKDHLV1HTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMT3 1 EQVGKVHVLPLI.QVQVDSAQVTVEQVHPDLLQDS 
QVHDSHMSELPEQVOVSYLEVGRIQTEEGTEVHVEELHVERVNO 
MP VE VQTE LLE AD LDH VT PE 1 MNQE E R ES SOADAAE AAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFWVFMPLGVL 
FHRRRAHGCTliSCSSFVEOPTAMEAEETMECLQEFPEHHKMILD 
RLNEQRE0DRFTDITL1VDGHKFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKLM10GEEEANDVWKAAE 
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEI EVE 1 AEGTI EVEDEGIETLEEVASAKOSVK 
YIQSTGSSDDSAliALLADITSKYRQGDRKGOIKEDGCPSDPTSK 
OVEGIEIVEL.QLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
S F KCE I CN KR Y LR ES AW K OH LNC YH LE EGG VS KKQRTGKK I H VC 
QY CE KQ FDH FG HFKEHLR KHTGE KP FE CPNCHE R FARNS TL KCH 
LTACQTGVGAKKGRKKLYEC0VCNSVFNSWDOFKDHLVIHTGDK 
PNHCTLCDLWFMOGNELRRHLSDAHNISERLVTEEVLSVETRVO 
TEP VTSMTI 1 E Q VGKVHV LPLLQVQ VD SAQVTVE Q VHPDLLQDS 
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNO 
MP VE VQTELL.E ADLDHVT P E I MNQE ER ES SQADAAEAAREDHED 
AEDLETKFTVDSEAEKAENEDRTAjbPVIjE 


6315 


1 


1015 


LGLAVNWTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA 
IDGXQARRTNS CSPLGELFDHGCDSLS TVFMAVGAS IAARLGTY 
PDWFFSCSFIGMFVFYCAHWOTYVSGMLRFGKVDVTEIQIALVI 
VFVLSAFGGATM WDYTI PILEI KLK I L P VLGFLGGVI FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGL.il ILAIMIYKKSATD 

Vr E.lSjiV\„L>i 1 LiPlr Vjv_ vr AM&VMj v VAfH*ll l\oDL I XAJUl vr XA?r 

GLLFLDOYFNN F I DE YWLWMAMVI SS FDMVI YFS ALCLQ ISRH 
LHLNI FKTACHQAPEQVQVLSSKSHQKNMD 


6316 


1503 


792 


VSAGAGTGI t^-G TTSTRRVTFEADENEN I TVV KG I RLSENVI DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAX ELDRERAAANEQLTRA I LRERI CSEEERAKAKHL 

aDOT.TTT'VrVP^/T VVnn&T?VVPnr.iDT TrVPQCFF'VRVTT13YYVn'K'A7X 
J\lK.\JLtr^l^]\Ut\. v Ltl\I\,\JlJJ\r lAtyilAKLttKoiljr Iftvl lAv* yJXtttt 

EEVE AKFKR Y E S H P VCADLQ AK I LQCYR ENTHQTLKCS ALATQ Y 
MHCVRHAKQSKLEKGG 


6317 


102 


83S 


PEAQTSAVLAREKGHLPTMRHEAPMOMASAQDARYGQKDSSDON 
FDYMFKLLI IGNSSVGKTSFLFRYADDSFTSAFVSTVGIDFKVK 
TVFKNE KR I KLQ I WDTAGQER Y RTI TTAY YRG AMG FI LMYD 1 TN 
EES FNAVQD WS TQ I KT YSWDNAQV ILVGNKCDMEDERVI STERG 
QHLGEQLGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLELRREAPPL 
LGPLLS PFPLPAGSVJHRQMLR S S LRFP1 TNSAGAPCKAAGRMN I 
LAPVRRDRVLAELPQCLRKEAALHGHKDFHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVIiGIPFSLQLWDTAC-OERFKCIASTYYRGAQAHIVFNLN 
DVASLEHT KC- WLADALKENDPS S VLL FLVGS KKDLST P AQYALM 
EKDALQ VAQEMKAE Y WAVS S LTG ENVR E FFFRVAALT FEANVLA 
ELEKSGAR R I GDWR INSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRI^QNTLLLGKKVVLVPYTSEHVPSRYHEWMKSEELORLT 
ASEPLTLEOEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNL F LTDLEDLTLGE 1 EVMI AEPSCRGKGLGTE AVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 


6320 


90 


1111 


RPRTGREKVAMAAVDSFYLLYREIARSCNCVMEALALVGAWYTA 
RKSirVlCDFYSLIRLHFIPRLGSRADLIKQYGRWAWSGATDG 
IGKAYAEELASRGLNI ILISRNEEKLQWAKDI ADTYKVETD1 1 
VADFSSGRE I YLPI REAbKDKDVGI LVNNVGVFYPYPQYFTQbS 
EDKLWD 1 1 NVNI AAASLMVHWLPGMVERKKGAI VTI S SGSCCK 
PT PQLAAFS ASKAYLDHFS RALQYEY ASKGI FVQSLI P F YVATS 
MTAPSNFLHRCSVJLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ 
FLFAOYM PE WLWVWGAN I LNRSLR KEALS CTA 


6321 


14 IB 


341 


HR KAALCALMAGRLLGKAIaAAVS LS XJUjASVTIRSSRCRGIQAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP 
NHAADPI I TR WKRDSSGNKIMHPVSGiCHILQFVAIKRKDCGEWA 
I PGGM VD PG E K I S ATL KRE FGE EALNS LQ KTS AE KR E I E E KLHK 
LFSQDHLVIYKGYVDDPRNTDNAWMETEAVNYHDETGEIMDNLM 
LEAGDDAGKVKWVD1NDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL 


6322 


2047 


1083 


NQEILKNVESSRTVQPHFLEFLLSLGWSVDVGRHPGWTGHVSTS 
WS I NCCDDGEGSQOEEV I SSED3 GAS I FNGQKKVLYYADALTEI 
AFWPSPVESLTDSLESNISDQDSDSNMDLMPGIliKOPSLTLEL 
FPNHTDNLNSSORLSPSSRMRKLPOGRPVPPLGPETRVSWWVE 
RYDD I ENFPLS EbMTE 1 STGVETTANS STSLRSTTLE KE VPVI F 
IHPLNTGLFR I KI OGATGKFNMVI PLVDGMI VSRRALGFLVRQT 
V3NICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


6S6 


PASTTDGAQE AR VPLDGAFWI PRPPAGSPKGCFACVS KPPALQA 
P AA PAP EP S A S P PMAPTLFPMES KS S KTDS VRAAG AP P AC KHLA 
E KKTMTN PTTVI EVYPDTTEVNDYYLWS I FNFVYLNFCCLGFI A 
LAYS LKVRDKKLbNDLNGAVEDAKTDRLIN ITRS GLAASCIMLW 
MALS VI ATHRGLRSSAS I L VAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQO 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAE FW TDGQTE P AAAG LG VETERP KQ KTEPDR S S LRTHLEWS W 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEASPWTQPGVHGP 
WTELETHGSQTOPERVKSWADNLWTHQNSSSLQTHPEGACPSKE 
PSAIX3SWKELYTDGSRTOODIEGPWTEPYTIX5SOKKODTEAARK 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 

RVEGGSGGFSSA5SFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYSPFWS FRKHYPWVQLSGHAGNFQAGEDGR I LKRFCQC 
EQRSLEQLMKDPLRPFVPAYYGMVLQDGQTFNQKEDLLADFEGP 
S IMDCKMGSRTYLEEELVKARERPRPR KDM YEKMVAVDPGAPTP 
EEHACKSAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCN'l'NF 
KKTOALFnVTKVT.FDFVDGDHVTT J1KYVACLEELREALE12PPF 
KTHFATVGSSLLFVHDHTG1AKVWM1DFGKTVALPDHQTLSHRLP 
WAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


165 


944 


GLRDPFRRKRRLKP0VKMSNYVND>5WPGSPQEKDS PSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRS RTP FRLS EKBRMELLE I AKTN AAKALGTTN I DL P AS LRT 
VPSAKETSRGI GVS SNGAKPEVS ILGLSEQNFQKANCQI 


6326 


238 


680 


GEPS PATQOKPS ATGAGVLHOHFSSGH I YVLMGLLP P?WTI S FT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTb 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA 






WO 01/53312 



PCT/US00/34263 



QAWGGVGOEASSGVP 


6327 




1337 


SaARLAPAGGSWMPTQQPAAPSTRAPKPSRSLSGSbCALFSDA 
DSGSGMKAELPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE 
LRAERRAKOEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNSLTOFMSIPSSVIHPAMVRLGLOYSQGLVRGSNARCIALLR 
AIjQQVIODYTTPFNEELSRDLVNKLKPYMSFLTQCRPLSASMHN 
AIKFLNKEITSVGSSKREEEAKSELRAAXDRYVQEKIVLAAQA1 
S RFAYQK I SNGDVIL VYGCSSLVSRILQEAWTEGRRFRWWDS 
R PWLEGRHTLRS LVKAGVPAS YLLI PAAS YVLPEVSTESKDS KV 
GGEKV 


6328 


1030 


276 


HAS AE VTTAAAR G LG AMEEEMHTDAK I RAENG TGS S PRG PGCS h 
RHFACEQNLLSR PDGS ASFLQGDTS VLAGVYGPAEVKVSKE1 FN 
KATLEVIliRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSl 
TWLQWS DAGS LLACCLNAACMALVDAGVPMRAliFCGVACALD 
S DGTLVLD PTS KOEKE ARAVLTFALDS VERKLLMSSTKGLYSDT 
ELQQCLAAA0AAS0HVFRFYRESL0RRYSKS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGbNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDI ERKS LAI NEEFVS I FKE VKEELES I £ EDVQAMSNCCQDMT 
SRL0A7AKEQTQDLIVKTTKLQSESQKLEIRAQVADAFLSKF0LT 
S DEMSLLR G'J 'R EG P 1 TEDFFKALGRVKQ I HNDVKVLLRTNQQT A 
GLEIME0J4ALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA 
MEALQDR P VLY KYTLD E FGTARRSTWRGF I DALTRGG PGGT PR 
PIEMHSHDPLRYVGDMI^^HQATASEKEHLEALLKHVTTOGVE 
ENIOBWGHI TEGVCRPLKVR I EQVI VAEPGAVLLYKI SNLLKF 
YMHTISG I VGNSATALLTTI EEMHIjLSKKI FFNSLSLHASKLMD 
KVELPP PDLGPSSAJbNQTLMLLREVLASHDSS WPLDAR OADFV 
OVLSCVLDPLbQMCTVSASNLGTAI)^TFMV>ISLY^KTTljALF 
EFTDRRLEMLQFOI EAHLDTLI NEQASYVLTRVGLSY I YNTVQQ 
HKPEOGSI^ANMPNLDSVTLKAAMVQFDRYLSAPDNLLIPQLNFL 
LSATVKEQ2VKQSTELVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKETLSLNKCPDKMPKRTKLLAQOPL 
P VHQ P H S L VS EG FT VKAMMKNS WRG PPAAGAFKERPT K PT AFR 
KFYERGDFPI ALEHDS KGNKI AWKVEI EKLDYHHYLPLF FDGLC 
EMTFPYEFFAROGIHDMLEHGGNKILPVLPQLIIPIKNALNLRK 
RQVICVTLKVLQHLWSAEMVGKALVPYYRQILPVLNI FKNMNV 
NSGDGIDYSOOKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 




QQGQR VRTRGR RACAS ATPLrEG CVDLS YPRTHAALLKVAOMVTL 
LIAFI CVR SSLWTNYSAYSY FEWTICDLIM I LAFYLVHLFRFY 
RVLTCI SWPLS ELW YLIGTLLLLIASI VAAS KS YNQSG LVAGA 
IFGFMATFLCMASIWLSYKISCVTQSTDAAV 


6332 


1 


878 


.VTESNKFDLVSFIPLLRERIYSNNQYARQFIISWILVLESVPDI 
NLLDYLPE I LDGLFQI LGDNGKEI RKMCSWLGEFLKE I KKNPS 
S VKFAEMANI LV IHCQTTDDLIQL.TAMCWKREFI QLAGR VML PY 
S SG I LTAVLP CLAYDDR KKS I KE VANVCNQSLM KLVT PE DDE LD 
ELR PGOROAE PTPDDALPKOEGTASGEWTPSLHLTSCRG PRE PD 
VIGVALGPHLSNQDYFMYVTHTIVAATORSGSSGSPPFCRQDTG 
KLSTMATHSOLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


trtpseaeaggespqscvsaahsdwtagkpvsllaplipprsag 
qpltfspsgrqpl3sllvgmcsgsgrrrsslsptmrpgtgaerg 
glmmghpgmhyapmgmhpmgqrakmppvphgmmpqmmppmggpp 
kgqmpgmmssvmpgmmmsht^jsqasmqpalppgvnsmdvaagtas 
GAKSMWTEHKSPDGRTyyYNTETKQSTWEKPDDLKTPAEOLLSK 
CPWKEYKSDSGXPYYYNSQTKESRWAKPKELEDLEGYQNTIVAG 
SLITXSNLHAMI KAEESSKQEECTTTSTAPVPTTE3 PTTMSTMA 
: AAE AAAA WAAAAAAA AAAAAANANASTS ASNT VSG T V P WPE P 
EVTS I VATWDN ENTVT I STEEQAQLTSTPAI QDOS V EVS SNTG 
EETSKQETVADFTPKKEEEESOPAKKTyTWNTKEEAKOAFKELL 
KEKR V PSNAS WEQAMKMI INDFRYSALAKLS EKKQAFNAYKVQT 
EKK 


6334 


17 


644 


GGNPSG RAAGFAAAAM PS S P LR VAWCSSNQNRS ME AKN i bS KR 
G FS VR S FGTGTH VKLPG P A P DK PNVYDFKTTY DQ MY N D LLR KD K 
ELYTONGII.HMLDRNKRIKPRPERFON'CKDLFDLILTCEERVYD 
QWEDLNSREOETCQP VH WNVDI QDN1IEEATLGAFL I CELCQC 
IQHTEDMENEIDELLQEFEEKSGRTFLHTVCFY 


6335 


82 


529 


AARARFGVLCCRLLGAALGDQSRVEMSYIPGQPVTAWORVEIH 
KLRQG ENLI LG FS I GGGIDQDPSQNPFSEDKTDKGI YVTR VSEG 
GPAEI AGliQIGDKI MQVNGWDMTMVTHDOARKRLTKRS EEWRL 
LVTRQSLQKAVQQSMLS 


6336 


1003 


438 


H E PAS KGRAEVGN M RLSVAAA 1 SHGRV FRRMGLG PESRIHLLRN 
LLTGbVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTKERAMRM 
ADFWLTEKDLI PKLFQVLAPR YKDQTGGYTRMLQI PNRSLDRAK 
MAVIEYKGNCLPPLPLPRRDSHLTLLNQLL0GLR0DLRQS0E7\S 
NHSSHTAQTPGI 


6337 


7€ 


524 


EGIOMLSVOPDTKPKGCAGCNRKIKDRYLLKAUJKYWHEDCLKC 
ACCDCRLGEVGSTLYTKANLILCRRDYLRLFGVTGNCAACSKLI 
PAFEMVMRAKDNVYHLDCFACOLCNQRFCVGDKFFLKNNMILCQ 
TDYEEGLMKEGYAPQVR 


6338 


66 


1349 


APNSESGTOGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GLRIALLLLLGIiGTPKSGVQGOEGLDFPEYDGVDRVINVNAKNY 
KNVFKKYEVLALLYHEPPEDDKASQRQFEMEELILELAAQVLED 
KGVG FGLVDSEKDAAVAKKLGLTEVDSMYVFKGDEVI EYDGEFS 
ADTIVEFLLDVLEDPVELIEGERELQAFENIEDEIKLIGYFKSK 
DSEHYKAFEDAAEEFHPYI PFFATFDSKGAKKLTLKLNEI DFYE 
AFKEEPVTIPDKPNSEEEIWFVEEHRRSTLRKLKPESMYETWE 
DDMDG I K I VAFAEE ADPDGFE FLETLKAVAQDNTENP DLS 1 1 W I 
DPDDFPLLWyWEKTFDlDLSAPQIG\nmVTDADRbWMEMDDEE 
DLPSAEELEDWLEDVLEGEINTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSOGAiMKAFH 
TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTESPQRVIITEDDEDETTVELEGQDENQEGDFEDADTOEGDTE 
SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLONSWESYY 
LEILMVTGLLAY 1 MNY I IGKNKNSRLAQAWFNTHRELLESNFTL 
VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMLIOLRFL 
KRQDLLN^/IiARl^RPVSDQVQIKVTMNDEDMDTYVFAVGTRKAL 
VRLOKEM0DLSEFCSDKPKSGAKYGLPDSLAILSEMGEVTDGMM 
DTKMVHFLTH YADKI ESVHFSDQFSGPKI MQEEGQPLXLPDTKR 
TLLLTFNV PGSGNT Y P KDMEALLPLKNM VI YS I DKAKXFRLNRE 
GKQKADKJ4 RARVEENFLKLTHVQRQEAAOSRREEKKRAEKER I M 
NE EDJ? E KOKR LEEAALRREQK K LE KKQM KM A.£>I K V KAM 


6340 


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERS FHSSSSS 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
P ARPGGAGN 1 KTLGDAY EFAVD VRDF S P ED I IVTT SNNHIEVRA 
EK1J^ADGTVMNNFAHKCQLPEDVDPTSVTSA1,REDGSLTIRARR 
HPHTEKVOQTFRTEIKI 


6341 


2 


645 


KI4AVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKAJ^VAPKPSSRGEYVVAKIjDDLViaWARRSSLWPMTFGlA 
CCAVE^MHrAAPRYDMDRFGVVFRASPRQSDVMIVAGTIjTNKMA 
PALRKVYDQMPEPRYWSMGSCANGGGYYHYSYSVVRGCDRIVP 
VDI Y I PGCPPTAEALLYGI LQLQRKIKKERRLOIWYRR 


6342 


2 


us: 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKI LDLTRVLAGPFATMNLGDLGAEVI KVERPGAGDDTR 
TWG ? P r VGTE ST YYLS VNRNKKS I AVN I KDPKGVKI I KELAAVC 
DVFVENYVPGKLSAMGLGYEDIDEIAPH1IYCSITGYGQTGPIS 
ORAGYDAVASAVSGLMHITGPJEVACLSmAAWYLIGQKEAKRWG 
TAHGSIVPYOAFKTKDGYIWGAGIWQQFATVCKILDLPELIDN 
SKYKTNHLRVHNRKELIKILSERFEEELTSKWLYLFEGSGVPYG 
P I NNM KNVFAE POVLHNG LVMEME K ? T VGK1 S V PGPA VR Y S KFK 
MS EAR PF PLLGQHTTH1 LKEVLR YDDRA1GELLS AG WDQHETH 


6343 




93b 


GTAM VSDEDELNLLV 1 WDANP I WWGKQALKESQFTLS KCI DAV 
MVLGNSHLFMNRSNKLAVIASH1QESRFLYPGKNGRLGDFFGDP 
GNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLLAGSIAKALCYIHRMNKEVKDNOEMKSR I LV I KAAEDS ALQ 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKV 
POMPS LLQYLLWVFLPDQDORSQLI LP PPVHVDYRAACFCHRNL 
IE1GYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLK 
VSA 


6344 


2508 


147 


TM PTATLGNLRG YG WAS PGLAAPS LT P PQLAT PNLQQFFPQATR 
OSLLG P PPVGVPMNPSQFNLSGRN PQKQARTS S STTPNRKDS SS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAQPQARMT 
VPKQTOTPDLLPEALEAQVLPRFOPRVLQVQAQVQSQTQPRIPS 
TDTQVOPKLOKQAQTQTSPEHLVLQQKQVQPQbQQEAEPQKQVQ 
POVQPOAHSQGPRQVQLQQEAEPLKQVQPOVQ PQAHSQP PRQ VQ 
LOLOKOVQTQTYPQVHTQAQPSVQPQEHPPAQVSVOPPEQTHEO 
PHTQPQVSLLAPEQTPVVVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVS MEE I QNE S ACGLDVGECENRAR EMPG VWGAGGS LKVTI 
LQS S DS RAFSTVPLTPVFRPS DSVSS TPAATST PSKQALQ FFCY 
I CKAS CS S OQE FQDKMSEPQHQQRLGE I QHMSQACLLSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDL1QHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSOGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS 
REEWKGSETYSPNTAYGVDFLVPVKGYICRICHKFYHSNSGAQL 
SHCKS LGH FENLQKY KAAKNPSPTTRP VS RRCAINARNALTALF 
TSSGRPPSQPKTQDKTPSKVTARPSOPPLPRRSTRLKT 


6345 


2 


3483 


PRVRTKL I LLVNDKKRYERVGGGPKRLGRDVEKEEMI EOI»OEKV 
HELEKONDTLKNRiISAKOX3LQTOGYRQT?YNNVQSRINTGRRK 
ANENAGLQECPRKG I KFQDADVAETPHPMFTKYGNSLLEEARGE 
IRNLENVI QSCRGQI EELEHLAEI LK7QLRRKENE IELSLLQLR 
EQQATDQR SNIRDNVEKI KLH KQLVEKSNALSAMEGKFIQLQEK 
0RTLK2SHDALMANGDELNMQLKEQRLKCCSLEK0LHSMKFSER 
RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKLKEQ 
0LKVQIA0LETALKSDLTDKTEILDRLK7ERDQNEKLVQENREL 
QLQYLEQKQQLDELKKR I KLYNQEND I NADELSEALLLI KAQKE 
QKNGDLS FbVKVDSE I NKDLERS MR ELQATHAET VQEL3KTRNM 
LI MQHK I NKDYQMEVEAVTRKMENLOODYEbKVEQYVHLLDI RA 
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 
ENLFE IK I NKVTFSS EVLQAS GDKEPVTFCTYAFYDFELQTTPV 
VRGLH P E YNFTSQ YLVHVNDL FLQY I QKNTI TLEVHQAYSTEYE 
TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RVPMDOAI RLYRERAKALGYI TSNFKGPEHMQS LSQQAP KTAQL 
SSTDSTDGNLNELHI TI RCCNHLOSRASHLOPHPYWYKFFDFA 
DHDTA1 1 PSSNDPQFDDHMYFPVPMNMDLDRYLKSESIiSFYVFD 
DSDTQENI YIGKVNVPLISLAHDRCI SGI FELTDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDLGNF3RSEEPEWQRLPPASSVST 
LVIAPRPKPRQRJjTPVDKKVSFVDIHPHQSDVSQEGSVDEVKEN 
TEKMQQGKDDVSLLSEGOXAEOSLAS SEDETEI TEDL-EP EVEED 
MSASDSDDCI XPGPISKNI KQPSEKI RIEI IALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILOKQEMPNRSLRPTWSDPPEDEQDLECEDI 
GVAHVDLADWFQEGRDLI EONIDVFDARADGEG XGKLRVTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


ODRRLLRLELQKTCQPTSTMSGSKTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQR LR WQAH LE FTHNHDVGDLTWD K I AVS LPRS EKLR S L VLA 
GIPHGMRPQLWMRLSGALOKKRNSELSYREIVKNSSNDETIAAK 
QI EKDLLRTMP SNACFASMGS IGVPRLRRVLRALAWLYPE IGYC 
QGTGM V AACLLLFLEEEDAFWMMS A I I EDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLOEHDIELSLITLHWFLTAFASVV 
DI KLLLRI WDL F FYEGSR VLFQLTLGMLHLKEEE LI QS E N S AS I 
FNTLSDI PSQMEDAELLLGVAMRLAG S LTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTELVADLR EAI LR VARHFQCTDPKNCS WSROLPGLli 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRXNDI I TI VSQKDEHCWVGELNGL 
RGWFPAXFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALX 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFR LDEDG KVLT PEELLYRAVQS VN VTHDAVK^QMDV KLR SL 
ICVGLNEQVLKLWLEVLCSSLPTVEKWYOPWSFLRSPGWVQIKC 
ELRVLCCFAFSLSOBWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVBG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPS I WPQE IIi 
AKYTQKEES AEQ P EF Y YDE FG FR VYKEEGDEPGSS LLANS PLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
G I PHGMRPQLWMRLSGALQK KRNS ELS YRE I VKNSSNDET I AAK 
QIEKDLLRTMPSNACFASMGS I GVPRLRRVLRALAWLYPSIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLI VQ YLPRLDKLLQEHDI ELSLI TLHWFLTAFAS W 
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETORRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTORRKSTITALIjFGEDDIjEAL 
KAKNIKQTELVADLREAILRVARHFOCTDPKNCSWSRQLPGLL 
PNTALT P PTPLVGLYSLWQELTPDYSMESHQRDHEN YVACSRS H 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPKLF I EEAAGREVERDFAS VYSRLVL 
CKTFR LDEDG KVLTPEELLYRAVQS VNVTHDAVHAQMDVKLRS L 
ICVGLNEQVLHLWLEVIiCSSLPTVSKWyOPWSFLRSPGWVQIKC 
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRIM4LVKHHLFSW 
DVjDG 


6348 


3 


3675 


AGAEKCFVTLLACFLAKCONKYKYEECKDLIKSMLRNELQFKEE 
KLAEQLKyAr.c.LRQYKVL VHSy tRfcLi i yijKfciUjKJitxKUAoK£>LiJN 
EHLQALLTPDEPDKSQGQDLQEQIJ^GCRIAQHLVQKLSPENDN 
DDDEDVQVE VAEKVQKS S S PREMQKAEEKEV PEDSLE E CAI TCS 
NSHG PCDSNQ PHKN I Kl T F EEDEVN STLWDRESSHDE CQDALN 
I LP V PGPTS S ATNVSMWS AGPLSGE KAAI N I LEINE KLRPQLA 
EKKQQFRNLKEKCFLTOLACFLANQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEBLRQYKVLVHSOERELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LS PENDNDDDED VQVE VAEKVQKS SAPREM PKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVSPRNLQESEEEEVPQES WDEG 
YSTbSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGFEVLQDSLDRCYSTPSGCLELTDS 
COPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEWEPEVLQD 
SLDRCySTPSSCLEOPDSCOPYGSSFYAbEEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DODPSCPRIjSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGO 
PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VGFSLDVGE3EKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHI SFALY VDNRFFTLTVTSLHLVFCMGVI FPQ 


6349 


3 


3679 


agaekcfvtllacflakqonkykyeeckdliksmbrnelqfkee 
klaeqlkqaeelrqykvlvhsqereltolreklregrdasrsln 
ei ilqalltpdepdksog0dlqeqlaegcrla0hlvqkls pendn 
dddedvqvevaekvqksss premqkaeekevpedsleecai tcs 
kskgpcdsnqphknikitfeedevnstlwdresshdecodaln 
i lpvpgptssatnvsmwsagplsgekaaini le i neklrpqla 
ekkqqfrnlkekcfltoiacflajnqqnkykyeeckdlikfmlrn 
erqfkeeklaeqlkqaeelrqykvlvhsqereltqlreklregr 
d as rs lnehlqalltpdep d ksqgqdlqeqlaeg cr laqhlvqk 
ls pendndddedvqvevae kvqkss aprempkaeekevpeds le 

ECA1TCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVSPRNLOESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWCQVK 
KEDHEATGPRLSRELLDEKGPEVLODSLDRCYSTPSGCLELTDS 
CQP YRS AFYVLEQQRVGLAVNMDE I EKYQE VEEDQDPS CPRLS R 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
OYLGLALDVDRIKKDQEEEEDQGPPCPRLSRELLEWEPEVLQD 
S LDRCY ST PS S CLEQ PDSCQ P YG S S FYALES KHVG FS LDVGE I E 
KXGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

DMDE I E K YQEVE E DQDPSC PR LSG ELLDE KE PEVLQE SLDR CYS 
TPSGCLELTDSCQPYRSAFY I LEQQRVGLAVDMDE I EKYOEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PySSAVYSLEEQYLGLALDVDRlKKDOEEEEDQGPPCPRLSREl, 
LEVVEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDONPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEEEH I S FALYVDN R FFTLTV7S LHLVFQMGV I FPQ 


6350 


3 


367S 


AGAEKCFVTLLACFLAKQONKYKYEECKDLIKSMLRNELQFKEE 
KLA£QLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSS PREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLVVDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQIACFliANQQNKYKYEECKDLIKFMLRN 
E RQ FKEEKLAEQLKQAEELRQ Y KVLVH SQE R E LTQLR E KLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LS P ENDNDDDEDVQVEVAE KVQKS S APREMPKAEEKEVPEDS LE 
ECAITCSNSHGFYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVH11 PENES DDEEEEEKGPVSPRNLQESEEEEVPOESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCKAVD1GRHRWD0VK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAFYVLEQORVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVIiQDSLGRCYSTPSGYIjELPDLGQPYSSAVYSLEE 

qylglaldvdrakkdoeeeedqgppcprlsrelojewepevl^od 
sldrcystpsscleqpdscqpygssfyaleekhvgfsldvgeie 
kkgkgkkrrgrrskkerrrgrkegeedqnppcprlsrelldekg 
pevlqdsldrcystpsgcleltdscqpyrsafyileqqrvglav 
dmde i ekyqeveedodpscpklsgelldexepevlqesldrcys 
tpsgcleltdscqpyrsafyileqqrvglavdmdeiekyqevee 
dodpscprlsrelldekepevlqdslgrcystpsgylelpdlgo 
pyssavysleeoylglaldvdrikkdqeeeedqgppcprlsrel 

LE W EPE VLQDSLDR C Y STPS S CJjEQPDS CQ PY GSS F Y ALEE KH 

vgfsldvgeiekkgkgkkrrgrrskkerrrgrkegeedqnppcp 
rlnsmlmeveepevlqdsldicystpsmyfelpdsfqhyrsvfy 
sfeeehisfalyvdnrfftltvtslhlvfqmgvifpq 


6351 


1791 


319 


REAR RRTER S QLGRMLW EVANGRS LVWG AE AV QALRERIjGVGG 
RTVGALPRGPRONSRLGLPLLLMPEEARLliAEIGAVTLVSAPRP 
DSRHHSLAJUTSFKRQQEESFQEQSAliAAEARETRROELLEKITE 
GQAAKKQKLEOASGASSSQEAGSSOAAKEDETSDGQASGEOKEA 
GP SS SQAGPSNGVAPLPRSALLVQ1ATARPR pvkar pldwrvqs 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHY3AOCWAPEDTIPLODLVAAGRLGTSVRKTLLLCSP0 
PDGK WY TSLQWAS LQ 


6352 


235 


S23 


WSEWLSPCHAAKCKGLSMbRlTOKTRAISLAADATEFVQGRSAP 
AMAR S LVHDTVF YCLS VYQVKI SPTPQLGAAS S AEGHVGOGAPG 
LMGNMNPEGGVNKENGMNRDGGMIPEGGGGNOEPROOPQPPPEE 
PAQAAMEGPQPENMOPRTRRTKFTLLOVEELES VFRHTQY PDVP 
TRRELA3NL<3VTEDKVRVWFKNKRARCRRHQRELMIJ^NELPJ\DP 
DDCVYIWC 


6353 


65 


672 


RFAGAGAI PEARAR PPDVQAAEEEKEMDLPDS AS RVFCGRI LSM 
VNTDDWAIILAOKNMLDRFSKTNEMLLNFNNLSSARLQOMSER 
FLHKTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSKIPEAS 
FLEEEDEDPIPPSTTTT1ATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQPGSPA1NGRSQTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNOEVFC 
H PVTDOSLI VELLELOAHVRGEAAARYHFEDVGGVQGARAVHVE 
SVQPLSLENLALRGRCOFAWVLSGKQQIAKENQOVAKDVTLHQA 
LLRLPOYQTDLLLTFNQPP 


6355


15B 


1662 


RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSliD 
WDGKVS EI KKKIKS I LPGR S CDLIjQDTSHLPP EHSDVVI VGGG V 
LGLS VA Y WLiKKLES R RGAI RVLWERDHTYSQASTGLSVGGI CQ 
QFSLPEJvf I QLSLr SAS FIjRN iNfc YIiAVVDAPPIjDIjRFNPSG YLL 
l^SEKBAJVAMESKVKVQROEGAKVSLMSPDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLl^GLRRKVQSLGVLFCQGEVTRFVSS 
S QRMLTTDD KAWLKR I HE VHVKMDRS LEYQPVECA I V I NAAGA 
WSAQIAALAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPOGPGL 
ETPLVADTSGAYFRREGLGSNYIX?GRSPrEQEEPPPANLEVDHD 
FFQDKVWP HliALRV P AF ETLKVQS AWAGYYDYNTFDQNG WG PH 
PLWNMY FATGFSGHGLCX)APGIGRAVAEMVLKGRFQTI DLS PF 
LFTRFYLGEKIQENNI I 


6356 


354 


633 


TGLTSS CL PLQ VM MTKR TKDKG KFSS VT VS T I DE E EEE I E AR E V 
ADSYAONAKVIEKOLERKGMSKRRLQELAELSAKKAKMKGTLID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVLRNQTSISQWVPVCSRLIPVSPTQGOCDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QPVEEKVGAFTKI I EAMGFTGPLKYSKWK1 KI AALRMYTS CVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKOEGRSGKYM 
CR 1 1 VH FMW EDVCQRGRVMGVNP YI LKKNMI LMTNHFYAAI LGY 
DEG I LS DDHG LAAAXiWRT FFNR KCEDPRKLELL VEYVRKQ I Q YL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 


1040 


ASDALHSLS APV LRLS S RS AARPATMTEQAI S FAKD FLAGG I AA 
AISKTAVAP3 ERVKLLLQVQHASKOIAADKOYKGI VDCI VRI PK 
EQGVLSFWRGNLANVI RYFPTQALNFAFKDKYKQI Fl^GVDKHT 
CFWRYFAGNIiASGGAAGATSLCFVYPLDFARTRLAADVGKSGTE 
REFRGLGDCLVK I TKSDG I RGLYQG FS VSVQG 1 1 1 YRAAYFGVY 
DTAKGMLPDPKNTHIWSWMIAQTVTAVAGWSYPFDTVRRRMM 
M0SGRKGADIMYTGTVDCV3RKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKV1 


6359 


98 


1086 


VCRQEEEKMKJBDCLPS S HVPI SDS KS I QKS ELLGLL.KTYNC YHE 
GKS FQLRHR E E EGTL 3 1 EG LLN I AWGLRR P 1 R LQMQDDRE 0 VHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASOJSQRRPKCRAPGEAQRIRRHRFS 
INGHFYNHKTSVFTPAYGSVXI^VRVNS'TKTTLQVLTLLLNKFRV 
EDGPSEFALYIVHESC-ERTKLKDCEYPLISRILHGPCEKIARIF 
LNEADLGVEVPHEVAQYI KFEMPVLDSFVEKLKEEEEREI I KLT 
MKFQALRLTMLQRLEQLVEAK 


6360


1 


345 


GTRGAVPSTLEEVVLPPRSCRVFWIHSGTTMSKVSFKITLTSDP . 
RLP YKVLSVPESTPFTAVLKFAAEEFKVPAATS AI ITNDG IGIN 
PAQTAGNVFLKHGSELRI I PRDRVGSC 


6361 


615 


ise 


R PG LGQLQHCALA PO AGNRRCRFHGRiiHALTR S TH RG KPM S I MQ 
FKDTLNTPLPDSS PVAVPLGAP3 AVASTLSVEHNDGVETGI WAC 
APGRWRRQITSOEFCHFIQGRCTFTPDDGETLHIOAGDALMLPA 
NSTGIWDIQETVRKTYVLIL 


6362 


350 


1576 


TTKDGSHSAALKLOOLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWS I ALNLQK Y CH 1 RliAG S KD P RA Y FKTKTWW LGLFLMLLG 
ELGVFASYAFAPLSLIVPLSAVSVIASAIIGI3FIXEKWKPKDF 
L RR YVLS FVGCG LAWGTY LL VT FAPN SHEKMTGENVTRHL VS W 
PFLLYMLVEI 3 LFCLLLYFYKEKNANNI WILLLVALLGSMTW 
TVKAVAGMLVLS I OGNLQLD YPI F YVMFVCMVATAVYQAAFLSQ 
ASQMYDSSLIASVGYILSTTIAITAGAIFYLDF3GEDVLHICMF 
ALGCLIAFLGVFLITRNRKKPIPFEPYISMDAMPGMONMHDKGM 
TVOPELKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 


21 


1201 


RRTRLGSSFPRRRDSSAI>1ESYDVIANQPWIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRV14AGALEGDIFIGPKAEEHRGLLSI 
R YPMEHGI VKDWNDMER I WQYVYSKDQLQTFSEEHPVbLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHS I MRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFEIVKAIKERACYLS1NPQKDETLETEKAOYYLPDGS 
TIEIGPSRFRAPF.LLFRPDLIGKESEGIHEVLVFAIOKSDMDLR 
RTLFSN I VLSGGSTL FKGFGDRLLSE VKKLAPKDVKI RIS APQE 
RLYSTW1GGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF 


6364 


21 


1201 


RRTRLGSSFPRRRDS3AMESYDVIANQPWIDNGSGVIKAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGI VKDWNDMERI WQY VYS KDQLQTFS EEH PVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAV PI YEGFAMPHS I14RID1 AGRDVSRFLRLY LRKEGY 
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS 
TI2IGPSRFRAPELL7RPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGSILASLDTFKKMWVSKXEYEEDGAKSIHRKTF 


6365 


234 


1999 


KHKSRASCAARAQAFGPSREREVHSRFRSGLRRLGESNSGCCTM 
ASMGTLAFDEYGRPFLIIKDQDRKSRLMGLEALKSHIKAAKAVA 
NTMRTSLGPNGLD^MVDKDGDVTVTNDGATILSKMDVDHQIAK 
LMVELS KSODDE I GCGTTGVWLAGALLEEAEQLLDRG IHPIRI 
ADGY EQ AAK V A I EH LDKI S DS VLVD I KDTE PL 1 OTAKTTLG S KV 
VKSCHROMAEIAVNAVLTVADKERRDVDFELIKVEGKVGGRLED 
TKLIKGVIVDKDFSHPQMPXKVEDAK1AILTCPFEPPKPKTKHK 
LDVTSVEDYKALOKYF.KEKFEEMIQQIKETGANLAICOWGFDDE 
ANHLLLQNKLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQE1 SFGTTKDKMLVIEOCKNSRAVTIFIRGGNXMI IEE 
AKRS LHDALCVI RNL3 RDNR VVYGGGAAEISCALAV5QEADKCP 
TLEOYAMRAFADALEVIPMALSENSGMNPICTMTEVRARQVKEM 
NPAIjG I DCLH KGTNDMKQQH VI ETL I G KKQOI SLATQMVRM I LK 
IDD3 RKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFWVLbS 1 FLGAVAMLCKEQGITVLGLNAVFDILV 
I G KFNVLE 1 VQ K VLH KD KS LENLG M LRNGGLLFR MTLLTS GG AG 
ML YVR WR I MG TG P P A FTEVDNPAS FADS MLVRAVN YN YY Y S LNA 
WLLLC PWWLC FDWSMGCI PLIKSIS DWRVI ALAALW FCL IGLIC 
OALCS EDGKKRR I LTLGLGFLVI P FLPASNLFFRVGFWAERVL 
YLPSVGYCVLLTFGFGALSKHTKKKKLIAAWLGILFINTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRl^PKYVHAMNNLGNILKERNELQEAEELLSLAVQIQ 
PDFAAAWMNLGIVQNSLKRFEAAEOSYRTAIKHRRKYPDCYYNL 
GRLYADLNRH VDALNAWRNATVLKPEHS LAWNNM 1 1 LLDNTGNL 
AQAEAVGREALELIPNDHSLMFSLANVLGKSQKYKESEALFLKA 
I KAK PNAASYHGNLAVLYHRWGHLDLAKKHYEI SLQLDPTASGT 
KENYGLLRRKLELMOKKAV 


6367 


287 


1934 


SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTOEEWALLDPS 
OKNLYRDVMOETFKNLTSVGKTWKVONIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT 
GHSSLNTHIRADTGHKSSBYQEYGENPYRNKECKKAFSYLDSFQ 
SHDKACTKEKFYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGXAFTRSTTLPVHER 
THTG VN AD ECKE CGN A FS FPS E I RRH KRS HTGEKP Y ECKQCGK V 
FISF£ S I QYliKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHEKTKTEDKPYGCKQCGKGFRCA 
SQLOIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH 
ECKQCGKAFR YFSSLHI HERTKTGDKPYECKVCG KAFTCSSS I R 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC 
GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RP VPAKLN PR S WPRTAGALPLRP PPLTMAVFHDEVE I EDFQYDS 
DS2TYFYPCPCGDNFSITKBDLENGEDVATCPSCSLI I KVIYDK 
DOFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCR DTR F PTPRGPGS LCHN FCR S AACTVTRT I HG S P R E DTGT " 

PRSREMMFODSVAFEDVAVSFTQEEWALLDPSOKNLYRDVMQET 

FKNLTSVGKTWKVON1EDEYKNPRRNLSLMREKLCESXESHHCG 

ESFNOIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 

TGHKSSEYOEYGENPYRNKECKKAFSYLDSFOSHDKACTKEKPY 

DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 

IHER1HTGVKPYKCK0CGKAFTRSTTLPVHERTHTGVNADSCKE 

CGKAFSFPSEIRRHKRSHTGEKPYECKOCGKVFISFSSIQYHKM 

THTG E KP Y EC KQCGKA FRCX3SHLQKHG RTHTGE KP YE CRO CGXA 

FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCAS0LQ1HERTHSG 

EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SS LHIHERTHTGDKPYECKVCGKAFTCSSS I RYKERTHTGEKPY 
ECKHCGKAFISNYIRYKSRTHTGEKPYQCKQCGKAFIRASSCRE 
KERTHT3NR 


6370 


1711 


329 


FVLSEQRLRTERTWPRSPGLGRGAAAAGARTAGAGLLRLLLGCG 
ALVGGLRPVTMTTPANAONASKTWELSLYELHRTPOEAIMDGTE 
1AVSPRSLHSELKCPICLDKLKKTMTTKECLHRFCSDCIVTALR 
5GNKECPTCRKKLVSKKSLRPDPNFDALISKIYPSREEYEAHQD 
RVLI RLSRLHNQQALS SSI EEGLRMQAMKRAQRVRRP j PGSDQT 
TTM S GGEG E PGEGEGDG E D VS S D S A P D£ APG P A P KR PRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGFPSPPGAPS 
P PE PGGEI ELVFRPHPLLVEKGEYCOTR YVKTTGNATVDHLSKY 
LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIY3AFGGGAFTTLNGSLTLELVNE 
KFW KVSR PLELCYAPT KDPK 


6371 


3 


288 


GVAJWSTAMNFGTKSFOPRPPDKGSFPLDHLGECKSFKEKrMKC 
L KNNNFEN ALCRKE S KE YL ECRME R X LMLQE P LE KLG FG DIiTSG 
KSEAKK 


6372 


2141 


625 


K VS A I ASEG KAE ER YKKLEDLLEKS FS LVKM P SLQPWMCVMKH 
LPKVPEKKLKbVMADKELYRACAVEVRRQIWODNOAIiFGDEVSP 
LLKQYILEKESALFSTELSVliHNFFSPSPKTRROGEVVQRLTRM 
VG >0^/KLYDMVLQFLRTLFLRTRN VH Y CTLRAELLMSLHDLDVG 
E 3 CT^/DPCHKFTWCLDACI RERF VDS KRARELQGFLDGVKKGQE 
OVLGDLSMILCDPFAINTLALSTVRHL0ELVG0ETLPRDSPDLL 
LLLRLLALG QGAWDKI D S Q V FKE P KM E VEL 1 TR FLPMLMS FLVD 
DYTFNVDOKLPAEEKAPVSYPNTLPESFTKFLOEQRMACEVGLY 
YVLKITKQRNKNALLRLLPGLVETFGDLAFGDIFLHLLTGNLAL 
LADEFALEDFCS SLFDG FFKTASPR KENVHRHALR LLIHLHPR V 
AFSKLEALOKAiEPTGOSGEAVKELYSOLGEKLEOLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PS RAARAS PARLPAMVS W IIS RLWL I FGTLYPAYYSYKAVKSK 
DIKEYVKWMMYWIIFALFTTAETFTDI FLCWFPFYYELK1AFVA 
WL hB PYTKGSSLLYRKFVH PTLS S KEKE IDDCLVQAKDRS YDAL 
VH FGKRGLNVAATAAVMAAS KGQGALS ERLRS FS MODLTT I RGD 
GAPAPSGPPPPGSGRASGKKGQPKWSRSASESASSSGTA 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTKTFCNYTSSTI FLSSTRDHS 
CFTHTFCNYTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAELQTEGS NGXKE VLS G FQ WLEDT VLFPEGGG QPDDRG TIN 
D I S VliR VTRRGEQADH FTOTPLDPGS 0 VLVR VDWE RR FDHMQQH 
SGQHLI TAVADH LFKLKTTS W ELGR F R S AI ELDTPSMTAEQVAA 
IEOSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGLPDDHAGPIR 
WNI EGVDS NMCCGTHVSNLS DLQV2 K I LGTE KGKKNRTNLI FL 
SGNRV^KWMERSHGTEKALTALLKCGAEDHVEAVKKIiQNSTKIL 
OKNKLNLbRELAVH I AHSLRNSPDWGGWI LHRKEGDS EFMN 1 1 
ANE 2 G S EST LLFLT VGDE KGG GL FLLAG PPAS VE TLG PRVAE VL 
EGKGAGKKGRFQGKATXKSRRMEAQAXLQDYISTQSAKS 


6375 


1 


1535 


AlMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGliAAWSRT 
CPGRPRRPGQQWRGPTMJjVTAYLAFVGLLASCLGLELSRCRAK 
PPGRACSNPSFLRFQLJ3FYQVTFLAI^JU)WIiQAPYLYKLYQHYY 
FLEGOIAILYVCGLASTVLFGLVASSLVDWLGRKKSCVLFSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWI PATFARAAPWNHVLAVVAGVAAEAVASWIGLGPVAP 
FVAAI PLLALAGALALRNWGEN YDRCRAFSRTCAGGLRCLLS DR 
RVLLLGTIQALFESVI FI FVFLWTPVLDPHGAPLGII FSSFKAA 
SLLGSSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMLTFSTSP 
GOESPVESFIAFLLIELACGLYFPSMSFLRRKVIPETEQAGVLN 
WFRVPLHSLACLGLLVLHDS DRKTGTRNMFS I CSAVMVMALLAV 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


ISSTD1DHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSK0EGRQ 
QDLLIAALGMKLGSPKSSVTIWOPLKLFAYSQLTSLVRRATLKE 
KEQIPKYEKIHNFKVHTFRGPHWCEYCANFMWGLIAQGVKCADC 
GLNVHKQCS KMVPNDCKPDLKH VKKVY S CDLTTLVKAHTTKRPM 
WDMCIRElESRGLNSEGLYRVSGFSDblEDVKMAFDRDGEKAD 
I S VNMYEDINI I TGALKLYFRDLPI PL I T YDA YPKF I ES AK 1 MD 
P DEQL ETLH E AL KLLP PAH CETLR YLMAH LKR VTLHE KENLMN A 
ENLGIVFGPTLKRSPELDAMAALNDIRYQRLWELLIKNED1LF 


6377


2311 




SR1RRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QRVEDVRLI RE0HPTKIPV3 IERYKGEKOLPVLDKTKFLVPDHV 
NM S EL I K 1 1 R R R LQLNANQ AF FLLVNGK S MVS VST P 1 S EVY E SE 
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRRNKAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAEYHAD3YDXVSGDMQK0GCDCECLGGGRISHQS0DKKIHVYG 
YSMAYGPAQHA1 STEKIKAKYPDYEVTWANDGY 


6379 


35 


378 


ERAGSPSPSRAALRRCAPORSQAPRWPDRAACRRSFQGSOGRAY 
LFN SWKVGCG P AEERVLLTGLKAV AD I YCENCKTTLG WKY EHA 
FES SQKYKEGKYI I ELAHMI KDNGWD 


6380


1414 


462 


PAVQGQRGAGP PTGRGSGNMAR FALT WRHGE TRFNKEKI I QGQ 
G VOE P LSETG F KOAAAAG I FLNWVK FTHAFS S DLMRTKQTMHG I 
LERSKFCKDMTVKYDS RLRER KYG WEGKALS ELRAMAKAAREE 
CPVFTPPGGETLDQVKMRGI D FFE FLCQL1 LKEADOKEOFS0GS 
PSNCLETSLAEI FPLGKtaHSSKVNSDSGI PGLAAS VLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFIINFEEG 
REVKPTVQC1 CMNLQDHLNGLTENS LGLNLPS KSNH FE PLKG VP 
LALFTSLLC 


6381 


1666 


216 


AWRAQGSRGFSGAGWRPRQAAAMNFSEVFKLSSLLCKFSPDGK 
YLASCVQYRLWRDVNTLO ILQLYTCLDOIQH I EWSADSLF ILC 
AMYKRGLVQVWS LEOPEWHCKI DEG SAGLVASCWS PDGRHI LNT 
TEFHLR I TVWSLCTKS VS Y I K Y PKACLQGITFTRDGR YMALAER 
RDCKDYVS IFVCSDWQLLRHFDTDTQDLTGI EWAPNGCVLAWJD 
TCLEYKI LLYS LDGRLLSTYS A YE WS LGI KS VAWS PSSQFLAVG 
S YDGKVRI LNHVTWKM2 TEFGHPAAI ND PKI WYKEAEKS POLG 
LGCLSFPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP 
XI Gl GMLAFS PDS Y FLATRNDNI FNAVWVWD1 QKLRLFAVLEOL 
S PVRAFQWDPQQ PRLAI CTGGSRLYLWS PAGCMS VQVPGEGDFA 
VLSLCWHLSGDS MALLS KDHFCLCFLETEAVVGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCL1AYPLKGDHGIVD1VDNSDCEPKSKLLRWTTNK 
KKHVLETE KTPKD WVROH R K E E KM KSK KLE EE FEWLK KS E VLYY 
TVEKKGNISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQYKFIL 
LENLTSRYEVPCVLDLKMGTRQHGDDASESKAANQIRKCQQSTS 
AVI GVR VCGMQVYQAGSGOLM FMNK YHGRKLS VQGFKEALFOFF 
HNGRYLRRELLGPVLKKLTELKAVLEROESYRFYSSSLLVIYDG 
KERPEWLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 
Tr>PAHT , pr , PT.v^:FnT\7VHPr:n'nAf;YT FfiLOSLIDIVTEISEESG 
E 


6383 


3159 


1061 


SPAPGRPSPHGSOPAARAAAAPAMPSAXQRGSKGGHGAASPSEK 
GAHPSAARPLAAPTPAAPACRS PS PGGAPAS FPGRAPRSLASQP 
AARAAAAPAMPSAKQRGSKGGHGAAS PS E KG AHPSGGADDVAXK 
PPPAPQQPPPPPAPHPQQHPQOHPONQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLE EVQQ VRRSHQDFS RQR EE LGQGLQG VEQKVQS LQA 
TFGTFESILRSSOHKQDLTE KAVKOGESEVSR I SEVLOKLQNE1 






WO 01/53312 



PCT/US00/34263 



LKDLSDGIHWKDARERDFTSLENTVEERLTELTKSINDNIAIF 
TEVQKRSQXEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 
AKSREWDMEALRSTLQTMESD I YTEVRELVS LXQEQQAFK E AAD 
TERLALOALTEKLLRSEESVSRLPEEIRRLEEELRQLKSDSHGP 
KEDGGFRHSEAFEALQOKSOGLDSRLQHVEDGVLSMQVASAROT 
ESLESLLSKSQEKEQRLAALOGRLEGLGSSEADODGLASTVRSL 
GETOLVLYGDVEELKRSVGELPSTVESLQKVQEQVHTLLSQDQA 
QAARLPPODFLDRLSSLDNLKASVSQVEADLKMLRTAVDSLVAY 
SVKIETNENNLESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


73 8 


19CK 


IWEVPVCLTHLLHLQQANQPLPPPSSSINEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKJDPNEPQXPVSAYALFFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKAliA 
AYRASLVS KAAAESAE AQTI RS VQQTLASTNLTS S LLLNTPLSO 
HGTVSASPQTLQQSLPRSIAPKPLTMRLPMNQIVTSVTIAANMP 
SNIGAPL3SSMGTT^WGSAPST0VSPSVQTQQHQMQLQQQQQQQ 
QQOMQQM0OOQLQQHQMH0Q I QQQMQCKJKFQHHMQQHLQQQQOH 
LQQQINOQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVAS01 
TSPIPAIGSPQPASQ0HQSQ1QS0TQTQVLSQVSIF 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGESLSGTRES 
LAQGPDAATTDELSSLGSDSEANGFAERRIDKFGFIVGSOGAEG 
ALEEVPLEVLRQRESKWLDMLNNV?DKWMAKKHKKIRLRCQK(4T.P 
PSLRGRAWQ YLSGGKVKLQQNPG KFDELDMS PGDP KWLDV I ERD 
LHROFPFHEMFVSRGGHGQODLFRVLKAYTLYRPEEGYCQAOAP 
IAAVLLMKMPAEQAFWCLVQICEKYLPGYYSEKLEA1QLDGEIL 
FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDMFFCEGVKI I FRVGLVLLKHALGSPEKVKACOGQYETI ER 
LRSLSPKIMQEAFLVQEWELPVTERQIEREHLIQLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALOPSPS 1 RLPLDAPLPGS 
KAKPKP PKQAQKSQRKOMKGRGOLEKP PAPNQAMWAAAGDACP 
PQHVPPKDSAPKDSAP0DLAP0VSAHHRSQESLTSOESEDTYL 


6386 


819 


19b 


TVCGSFYLGIMORASRLKRELKKLATEPPPGITCWODKDOMDDL 
RAQ I LGG ANTP Y E KGV FKLEVI I PER YP FEP PQ I RFLT P I YH PN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTSIQLLMSEPNPDD 
PLMADI S SEFKYN KPAFLKN ARQWTEKHARQKQKADEEEMLDNL 
PEAGDSRVHNSTQKRKASQLVG1EKKFHPDV j 


6387 


1 


662 


PGPTHASADAWADAWAQPNr4AMHNKAAPPQIPDTRRELAELVKR 
KQELAETLANLERQ I YAFEGSYLEDTQMYGN1 I RGWDRYLTNQK 
NSNSKNDRRN R KFK E AERLF S K S S VTS AAAVS ALAG VQDQL I E K 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
S TS SGS HHS SH KKR KN KNRH S P SGMFD YD FE I DLKLNKKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAOPNMAMHNKAAPPQIPDTRRELAELVKR 
KQELAETLANLERQI YAFEGSY LEDTQMYGNI I RGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPOKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6389


1074 


497 


AEPGDRMAGHRLVLVLGDLH I PHRCNSLPAKFKKLLVPGK1 OH I 
LCTGNLCTKESYDYLKTLAGDVHIVRGDFDENLNYPEQKWTVG 

APVTrT t vtr^urwrr nuirr^M > c t at t ncni?n\rm 7 .T cmjnrtTVPtr a Tr 
yr K.J.wJjiHt3nUVJ. V WuJJrl/io JUft uLi^JriiJ rVvV 1 ±t J. cun I WArr-Ar 

EHENKFY1NPGSATGAYNALETNIIPSFVLMDIQASTWTYVYQ 

LIGDDVKVERIEYKKP 


6390


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWOGSGSHG 
LTIAQRDDGVFVQEVTONSPAARTGWKEGDQIVGATIYFDNLQ 
SGEVTQLLNTMGHHTVGLKLHRKGDRFFPSLGQTKDP 


6391


5386 


2897 


VRWNSKTECYLSIQTQENFPANLNELVNC1VISSLVTTQRKLKA 
MSLLGSRNOLARAVLNPNPMDFCTKDLLTTTS ER I IAYLRDFNE 
DQKKAIETAYAMVKHSPSVAKICIilHGPPGTGKSKTIVGLLYRL 
LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKIILEF 
KEKCKDKKNPLGNCGDINLVRLGPEKSINSHVLKFSLDSOVKHR 
MKKELPSHVOAMHKRKEFLDYQLDELSRQRALCRGGREIOROEL 
DEN I S KVS XERQEIAS KI KEVQGRPQKTQS 1 1 1 LESH 1 1 CCTLS 
TSGGLLLESAFRGQGGVPFS CV I VDEAGQS CE JETLTPLI HR CN 
KLI LVGDPKQLPPTVI SMKAQEYG YDQSMMARFCRLLEENVEHN 
MISRLPILQLTVOVRMHPDICLFPSNYVYNRNLKTNRQTEAIRC 

KRKDVSFRNIGI1THY KAQKTM I QKDLDKE FDRKG P AE VDT VD A 
FOGR0KDCVI VTCVRANS IOCS IGFLASLQRLNVTITRAKYSLF 
ILGHLRTLMENOHWNQLIODAQKRGAI I KTCDKNYRHDAVKI LK 
LKPVLORSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDSKE ITLTVTS KDPERPPVHDQLQDPRLLKRMG I EVKGG 

LS S H K P P VRG E P F AA S P E ASTCQS KCDDP EEELCHRREARAF SE 
GEQEKCGSETHHTRRNSRTOKRTLEQEDSSSKKRKLL 


6392 


972 


186 


GRTGVDLAS SMAHRLQ I RLLTWDVKDTLLRLRH PLGEAYATKAR 
AHGLEVEPS ALEQG FR 0 A YRAQS HSFPN YG LSHGLTS RQ W WLDV 
VLOTFHLAGVQDAOAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 
ECRTRGLRLAVISKFDRRLEGILGGLGLREHFDFVLTSEAAGWP 
KPCPRIFOEALRLAHWEPWAAHVGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVpKEHILPSLAHLLPALDCLEGSTPGL 


6393 


2017 


730 


TGGS KMAAVATCGS VAASTGSAVATAS KSNVTS FQRRGPRAS VT 
ND S GP RLV S I AGTR PS VRNGOLLVS TGL PALDQLLGGGL AVGTV 
LLI EEDKYNI YS PLLFKYFLAEGI VNGHTLLVASAKEDPANILO 
ELPAPLLDDKCKKEFDEDVYNHKTPESN1KMKIAWRYOLLPKME 
IGPVSSSRFGHYYDASKRKPQELIEASNWHGFFLPEKISSTLKV 
EPCSLTPGY T1CLLQF 1 ONI I YEEG FDGSNPQKKQRN1LR 1 GION 
LGSPLWGDD1 CCAENGGNSHSLTKFLYVLRGLLRTSLSACI ITK 
PTHli 1 QNKA 1 1 ARVTTLS DWVGLESF I GSERETNPLYKDY HGL 

TUTOrtTDDI WTvTT T/'VUTOTM/fcTlT ZV 1? VT . VD VT CTTPOl LIT 13 Tin? On 

TVSRSSKMDLAESAKRLGPGCGMMAGGKKHLDF 


6394 


1418 


511 


GAAAGGEGARRRPAAMATVMAATAAERAVLEEE FRWLLHDE VHA 
VLKQLQDILKEASLRFTLPGSGTEGPAKQENFILGSCGTDQVKG 
VLTLQGDALSQ AJDVNLKM P RNNQLLHFAFREDKQWKZjQQ I QDAR 

TPATLTLPE 1 AASGLTRWFAPAIiPSDLLVNVY INLNKLCLTVYQ 
LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRliEVSHVHKVEC 
VIPWLNDALVYFTVSIjOLCOQLKDKISVFSSYWSYRPF 


6395 


13 


65fc 


PSGRPTH PLCCAAR RGAARHGGS VSGW PAGRTPTETSN PGS S VM 
ESVTFEDVAVEF I QEWALLDS ARRSLCKYRMLDQCRTLASRGTP 
PCKPSCVSOLGORAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLEEASSRDMOMGPGLFLRMQLVPS I EERETPLTREDRPALQE 
PPWSLGCTGLKAAMQIQRWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


ANILSSPSKRGOKGTLIGYSPEGTPLYNFMGDAFQHSSOSIPRF 
I KESLKO I LEE SDS RQI F YFLCltNLLFT F VELFYGVLTNSLGLI 
SDGFHMLFDCSALWGLFAALMSRWKATRI FSYGYGRI EILSGF 
ING1>FLIV I AF FV F WES VAR1* 1 DPPELDTHMLTPVS VGGL1 VN L 
IGI CA FS HAHSHAHG ASOG SCHS SDHSHS HHMHGHS DHGHGHS H 
GSAGGGMNANMRGVFLHVLADTLGS 1GV I VSTVL I EQFGWF IAD 
PLCSLFIAILIFL5WPLIKDACQVLLLRLPPEYEKELHIALEK 
IQKIEGLISYRDPHFWRHSASIVAGTIHIQVTSDVLEQRIVQQV 
TGILKDAGVNNLT1QV2KEAYFQHMSGLSTGFHDVLAMTKQMES 
MKYCKDGTYIM 


6397 


391 


122


GAGGVGRFEAIRAPARMIEWCNDRLGKKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKI VIjKKWYTI FKDHVSLGDYEIHDGMNLEbYY 
V 


6398 


353 


1306 


HKQMGPLlNRCKKIbLPTTN/PPATMRIWLLGGLLPFLLLLSGLO 
RPTEGSEVAIKIDPDPAPGSFDD0YQGCSKQVMEKLTOGDYFTK 
DIEAQKNYFRMWOKAHLAWbNQGKVLPQNMTTTHAVAILFYTLK 
SNVH SDFTRAMAS VARTPQQYERS FHFKYLHY YLTS A I QLbRKC 
SIMENGTLCYEVHYRTKDVHFNAYTGATIRFGQFLSTSLLKEEA 
OEFGNQTLFTlFTCLGAPVQyFSLKKEVLIPPYELFKVJNMSYK- 
PR G DW LQ LR STGN LS T YNCQLLKAS SKKCIFDPIAIASLSF LTS 
V1IFSKSRV 


6399


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRXRTWSKRGVAVSGPrK 
RRGMADS LESTPLPSPEDRLAKLHPS KELLEYYQKKMAECEAEN 
EDLLKKLELYKEACEGOHKLECDLOQRSEEIAELQKAbSDMQVC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVT3LQKTIQAVGECE0SESSAFKADPKISKRRPSR 
ER KES S EHYCRDIQTLIL0VEAL0A0LGEQTKLSRE03 EGblED 
RR1HLEE1OV0HORNONKIKELTKNLHHTQELLYESTKDFLQLR 
SEN0NKEKSWMLEKDNLWSKIKQYRVQCKKKEDK1GKVLPVMHE 
SHHAQSEY1 KVMSLCRNEWYFSGRVEGIPKNLQFVV* 


6400 


2520 


1053 


KTKKCDEWYEVOSAILRHNCGYAMKTGXFFHNLMERKDFETWL 
DNISVTFLSLTDLQKNETLDIILISL-SGAVQLRKLSNNLETLLKR 
DFLKLLPLEbSFYLLKWLDPQTLLTCCLVSKOWNKVISACTEVW 
QTACKNLGWQIDDSVODALHWKKVYLKAILRMKQLEDHEAFETS 
SL1GHSARVYALYY KDGLLCTGSDDLSAKLWDVSTGQCVYG I QT 
HTCAAVK FDEQ KL VTGSFDNT VACWE W S S G ARTQH FRG HTG AVF 
SVDYKDELDILVSGSADFTVKW3ALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKSLLHS PGDYILLSADKYEI KI WPIGREI NCKCLKTL 
SVSEDRS 1CLQPRLHFDGKY1VCSSALGLYQWDFASYDILRVI K 
TPE I ANLALLGFGDI FALLFDNRYLY I MDLRTESLI SRWPbPEY 
RKSKRGSS FLAGEASWLNGbDGHNDTGLVFATSKPDHS I HLVLW 
KSHG 


6401 


109 


766 


PGA?»WSRPDLRGCCTGPQPALRMLVLPSPCPQP1AFSSVETMEG 
PPRRTCRSPEPGPSSSIGSP0ASSPPRPNHYLLIDT0GVPYTV1, 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHS1THS 
EVKFFECDICG KAFKRASHLARHHS I HLAG GG R PHG CPLC PRRF 
RDAGELAQRSRVHSGERPFOCPHCPRRFMEQNTIUQKHTRWKHP 


6402 


1196 


279 


TTSOCGGIRQSSAIPVASMEFAAICLRNALLliLPEEOQDPKCEN 
GAXNSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLRKQE 
LENLKCS I LACSAYVALALGDNLMALNHADKLLCX3PKLSGSLKF 
LGHLY AAEAbI SLDR 1 SDAITHLNPENVTDVSLGI SSNEQDQGS 
DKGENEAMESSGKRAPOCYPSSVNSARTVMLFNLGSAYCLRSEY 
DKARKCLHO AASMIKPKEVPPEAI LLAVYLELQNGNTQLALQ 1 1 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 


2 


1690 


RG1 HTSVLOGNbONQMYSHNWIMNliNNLNLTOVQQRNLI TNLQ 
RSVDDTSQA I QRIKNDFONLQQVFLOAKKDTDWLKEKVQSLQTL 
AANI^S ALAKAI WDTLEDMNSOLNS FTGQMENI TTI SQANEOMLK 
DLODLHKDAENRTA I KFNOLEERFOLFETDI VNI I SN1 S YTAHH 

MQQDLMRSRLDTEVAl^SVIKiEEMKLVDSKHGQLI KNFT 1 LQGP 
PGPRGPRGDRGSOGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEG L PGPQG P PG FQGLQGTVGE PGVPG PRGLPGLPG V PGM PGP \ 
KGPPGPPGPSGAWPLALONEPTPAPEDNSCPPHWKNFTDKCYY 
FSYEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WI G LT DS ER EN E WKWLDGTS ?DY KNVJ KAGQPDN WGHGHGPGEDC 
AGL I YAGQWNDFQCEDVNNFI CEKDRETVLSS AL 


6404 


1012 


222 


AAAIJ s J4AAFAPGLISVFSSS0EI/7AAliAQLVAQRAACCLiAGARA 
RF AbGbSGGS LVS KLARELPAAV APAGPASLARViTLGFCD ER LV 
PFDKAESTYGLYR7HLLSRLPIPESQVITINPELPVEEAAEDYA 
KKLRCAFQGDS I PVFDLLILGVGPDGHTCSLFPDHPLLOEREKI 
VAPISDSPKPPPORVTLTLPVLKAARTVIFVATGEGKAAVLKRI , 
LEDQE EN P LP AALVQ PHTGKLCW FLDEAAARLLT VP FEKHS PL 


6405 


1 


1456 


AAL PR PTPRAPLGR EGTGS DS EMAASM F YGRLVAV ATLRNKR P R 
TAORAAAQVLGSSGLFNNHGLQVOQQQQRNLSLHEYMSMELLOE 
AG VS V P KG Y VA KS P DE A YA I AK KLGS KD W I KAQ VLAGG RG KGT 
FESGLKGGVKIVFSPEEAKAVSSOMIGKKLFTKQTGEKGRICNQ 
VLVCERKYPRREYYFAITMERSFOGPVLIGSSHGGVNIEDVAAE 
TPEA1 IKEPID2EEGI KKEOALOLAQKMGFPPNIVESAAENKVK 
LYSLFLKYDATMIE 'NPMVEDSDGAVLCKDAKINFDSNSAYROK 
KirDLQDWTOEDERDKDAAKANLNYIGLDGNIGCLVNGAGLAMA 
TMDI3 KLHGGTPAK FLDVGGGATVHQVTEAFXLITSDKKVLA3 L 
VN j r GG 3 MR CDVIAQG I VMAVKDLEI Kl PVWRLOGTRVDDAKA 

LI adsglk: lacddldeaarmw klsei vtlakqahvdvkfolp 

I 


6406 


1036 


167 


hpromrgedtpeappyssgrydsiktevsgcpsdltvgraptad 
pddddhddhedndkwitdsegmdperlkafnmfvrll^dekldrm 
vp] gkqpkekioai 3escsrqfpefqerarkrirtylkscrrmk 
kng kiemtr ptp phlts amaeni laaaces etrkaakrmrle i yq 
sscdepialdkohsrdsaaithstyslpassysqdpvyanggln 
ys v rg yg als s nlg p p aslqtgnh s ng esge aralas r pa ps wv 
craalgsgngrgkcr pvmergclta 


6407 


492 


1150


VGLCLAVSOTVLAOLDALLVFPGOVAQLSCTLS PQHVTI RDYGV 
SWY QQRAGSAP RYLLY YRSEEDKHRPAD1 PDRFSAAKDEAJ 1RAC 
VLC1 SPVQPEDDADYYCSVGYGFSP 


6408 


1458 


903 


RGC; tssqav;rlfggvtrgfnmri EKCYFCSGPIYPGHGMMFVR 
NDOCVFRFCKSKCHKNFKKKRKPRKVRWTKAFRKAAGKELTVDN 
SFEPEKRRNEPIKYGRELWNKTIDAMKRVEEIKQKRQAKFIMNR 
LKJCN K ELQKV0D I KE VKQNI HLI RAPLAGKGKOLEEKMVQOLOE 
DVDKEDAP 


6409 


150 


446 


KT7iL ANLL R C FTCDR L CGG CTAPA P PAHQG I VLQP VMPS CDPG P 
GPACLPTKTFRSYLPRCHRTYSC\ / HCRAHLAKHDELISKSFOGS 
HGRAYLFNSV 


6410 


85 


607 


RGG7AGCVACLGCWG0SSSPKAAFPAGSACLPADSCPCLLF0AC 
AISGLFNC1 71 HPLN1AAGVWMIMNAFILLLCEAPFCCQFIEFA 
NTVAE KVDRLR S WQKAVF Y CGMAW PI VI SLTLTTLLGNAI A FA 
TGVI Y GLSALG KKGDAI S YARIQOQRQOADEEKLASTLEGEL 


6411 


302 


772 


rls3masslnedpegsrityvkgdlfacpktdslahcisedcrm 
gag:«avlfkkkfggvqellnqqkksgevavlkrdgryiyylitk 
krashkptyenlqksleamkskclkngvtdlsmprigcgldrlq 
wenvsamieevfeatdikitvytl 


6412 


61 


1709 


RPVTS FS PLPG S CGG RLGTRTMLG RS LR E VS AALKQGQ I T PTSL 

cqkclsl i kktkflnayi tvs eevalkqaees ekrykngqslgd 
ux:>:piavkdnfst£giettcasnmlkgyippynatvvqklldq 
gallmgktnldefamgsgstdgvfgpvknpwsyskqyrekrkqn 

PHS EN EDS DWL I TGG S SGGS AAAVS AFTCYAALGS DTGGS TRN P 
AAK CG LVGFK PS YGL VSRHGLI PLVNS MDVPGI LTRCVDDAAI V 
LGALAG PDPR D S TTVHE PINKPFMLPS LADVS KLC I G I P KE YLV 
PELSS E VQSLWS KAADLFESEGAK VI EVSLPHTS YS I VCYHVLC 
TS EVA S NMAR FD G L£ YGHRCD I DVSTEAMY AATRR EGFND WRG 
RlLSGNFFLLKEl^ENYFVKAQKVRRLIANDFVNAFNSGVDVLL 
TPT7LSEAVFYLEF I KEDNRTRSAODDI FTQAVN MAGLPAVS I P 
VAL.SNOGLPIGL0FIGRAFCPQ0LLTVAKWFEKQV0FPV30L0E 
LMDDCSAVLENEKLASVSLKQ 





6413 


2 


885 


HEPRCAG WAASLWMGDLEPYMDENFI SRA FATMGET VMS VKI I R 
NRLTG I PAG YCFVEFADLATAE KCLHK I NGKPLPGATF AKRFKL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
G K WLDO TG VS KG YG F VKFTDE L EQKRALTE CQG AVG LG S KPVR 
LSVAIPKASRVKPVEYSQMYSYSYNQYY00YONYYAOWGYDONT 
GS YSYSYPQYGYTOSTMOTYEEVGDDALE DP MPQ1.DVTEANKEF 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAAIrLPWRRFPCCR PRPQPARPSSRATPGPRSPGMATSIGV 
S FSVGDGV PE AEKNAGE PENT Y I LRPVFQQR FRPS WKDCI HAV 
LKEELAI^AEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 
VIGEQRGEGVFMASRCFWDADTDNYTHDVFI4NDSLFCVVAAFGC 
FYY 


6415 


2 


1166 


FVRQWQSS HRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
F I E AAVAG LGAGS G KRRRGW KM P VHSRG D K K ETNHH DEM E VD YA 
ENEGSSSEDEDTESSSVSEDGDSSEMDDEDCERRRMECLDEMSN 
LEKQFTDL K DQLY KERLSQVDAKLQEV1 AG KAPEYLE PLATLQE 
NMQIRTKVAGIYRELCLESVKNKYECEIOASRQHCESEKLLLYD 
TVQSELEEKIRRLEEDRHSID1TSELWNDELQSRKKRKDPFWPD 
KKKPGWSGPYIVYMLQDLDILEDWTT1RKJJ4ATLGPHRVKTEP 
PVKLEKHLHSARSEEGRLYYDGEWYIRGOTICIDKKDECPTSAV 
ITTINHDEVWFKRFDGSKSKLYISOLOKGKYSIKHS 


6416 


410 


1519 


EI APADLE I PACAPVLLS RATSSTMS VTGG KMAPSLTQE I LSHL 
GLASKTAAWGTLGTLRTFLNFSVDKDAORLLRAITG0GVDRSAI 
VDVLTNRS REQRQL I SRNFQERTQQDLMKS LQAALSGNLERI VM 
ALLOPTAQFDAQELRTALKASDSAVDVAI E I LATRT PPQLQECL 
AVY KHNFO VEAVDG ITS ETSG I LQDLLLALAKGGRDS YSGI I D Y 
NLAE0DV0ALQRAEGPSREETWVPVFTCKNPEHLIRVFDOYORS 
TGOELEEAVONRFHGDAQVALLGIiASVIXNTPLYFADKLHQALQ 
ETEPNYOVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCOSALLALCRAEDM 


6417 


1




RGESR VLWS ELEGEAGGAGGWAS S LNARMDNRFATAFV I ACVLS 
L I ST I YMAAS I GTDFWYE YRS P VQENSSDLNKS I WDEFI S DEAD 
EKTYl-IDALFRYNGTVGLWRRCITIPKNMlIVsYSPPERTESFDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGAL-3 G1»CACI CRSLYPTI ATG I LHLLAGLCTTjGS VSCYV i 
AG I EL LHQKLE LPDNVSGE FGWS FCLACVSAPLQFMASALFI WA 
AKTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTS TPAGAS PAAAYQADPPPPAH 
TPAPPPPFPCGGIACHGEPAKFYGYDNLQROPIFTTOOEAELVQ 
YFDCKSSSGNIGEDPDHLNQSSSPSQMFFWMRPQAAPGRRRGRQ 
TYSRFOTI.ELEKEFLFNPYLTRKRRIEVSHALALTEROVKIWFQ 
NRKMKWrCKLNNKDKr PVSRQEVKDGbl JvKbAQELEEDKAhOJbTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRSIQIPAflRSKTAMSKCPIFP 
MARS1STSGPLDKEDTGR0KLISTGSLPATL0GATDSLGLEWHL 
PSPDPVTVPYLSPLWWKELESLLENEGDHAITVADFVDHHPIV 
FWNLVWYFRRLDLPSNLPGLILSSEHCNKYSKIPRHCMSEDSKY 
VL IOMLWDNMKLHQDPGQPLY I LWNAHTOKYPMVHLLQKSDNS F 
NQELLKSMVKS I KMNDVYGPMSQILETLNKCPHFKRORS LYREI 
LFLS LVALGR EN I D I DAFDKE Y KMA YDRLTPSQVKS THNCDRP P 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKMIDKNOTCGVGQDSVPYMICLIHILEEWFGVEQLEDYLNFAN 
YLLWVFTPLILL1LPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH 
NLWDGAR KTVATLWDGHAAVWHGYEVHGME KI PEDGPAL 1 1 FYH 
GAI P 1 DFY YFMAK I FIHKGRTCRVVADH FV FKIPGFS LLLDVFC 
ALHGPREKCVEILRSGHLLAI SPGGVREALI SDETYNI WJGHRR 
GFACVAIDAKVPIIPMFTONIREGFRSLGGTRLFRWLYEKFRYP 
FAPMYGG FP VKLRTYLGDP I P Y DPQI TAEELAEKTKNAVQAL I D 
KH0R1 PGN IMSALLERFH 


6421 


1B44 


362 


WALSLRROPERWSNKLLSPHPHSWLRSEFKMASSPAVLRASRL 
YOWSLKSSAQFLGSPQLRQVGQIIRVPARMAATLILEPAGRCCW 
D E PVR I AVRG IAPEQ P VTLRAS LRDE KG AL FQAKAR YRADTLG E 
LDbERAPALGGSFAGLEPMGLLWALEPEKPLVRLVKRDVRTPLA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRG? 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSKPEVKGPGVGLLGISKGG 
ELCLSMAS FLKG1 TAAWIKGSVANVGGTLRYKGETbPPVGVNR 
WRlKVTKDGYADIVDVLNSPL,EGPDOKSFiPVERAESTFL.FLVG 
ODDHNWKSEFYANEACKRLOAHGRRKPQI I CYPETGHYIEPPYF 
PLCRAS LHALVGS P 1 1 WGGEPRAHAMAOVDAWKC LQT FFHKHLG 
GREGTIPSKV 


6422 


181 



2132 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFLiRYNFDVTKGKIFIE 

WMKGATTNICYNVLDRNVHEKKLGDKVAFyWEGNEPGETTQITY 

H0LLV0VCQFSNVLRKOG1 HKGDRVAI YMPM I PELWAMLACAR 

lGALHSlVFAGFSSESLCERILDSSCSbklTTDAFYRGEXLVNL 

KEIjADEAiQKCQEKGFPVRCCIVVKHL»GRAELGMGDSTSQSPPI . 

KR S C PD VQ 1SWNQG IDLWWHELMQE AGDECE P E WCDAEDPLFI L 

YTSGSTGKPKGVVHTVGGYMLYVATTFKYVFDFHAEDVFWCrAD 

3 GWI TGHSYVTYGPLANGATSVLFEG1 PTYPDVNRLWSI VDKYK 

VTKFYTAPTAIRLL.MKFGDEPVTKHSRASLQVU5TVGEPINPEA 

WLWYHRWGAQRCPIVDTFWQTETGGHMLTPLPGATPMKPGSAT 

FPFFGVAPAILNESGEELEGEAEGYLVFKOPVJPGIMRTVYGNHE 

RFETTYFKKFPGYYVTGDGCQRDQDGYYWJTGRIDDMLNVSGHL 

LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 

FS PKLTEELKKQI REKI GPIATPDYI ONAPGLPKTRSGKIMRR V 

LRKI AQNDHDLGDMSTVADPS VI SHL.FSHRCLTI Q ! 


6423 


614 


1237 


AN1/KEI PKDLP PE TVLL YLDSNQ I TS I PNE J FKDLHQLR VLNhS 
KKGI EFI DEHAFKGVAETLQTLDLSDNRIOS VHKNAFNNLKARA 
R 37aNNPWHCDCTLQQVLRSMASNHETAHNVJ CKTSVLDEHAGRP 
FLNAANDADLCNLPKKTTDYAMLVTMFGWFTMVI S YWY YVRQN 
QED ARR H t>E Y LKSL.P SRQKKADEP DD I ST W 


6424 


1 


1186 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWWAHNRMTIjKQLKDR Y PRKRFVAG VGANS K I SA 
LKGSKGKDWE2 PVPVGISVTDENGKI I GELNKENDR I LVAQGGL 

ggklltnflplkgokriihldlkljadvglvgfpnagkssllsc 
vskakpaiadyafttlkpelgkimysdfkoisvadlpgliegah 

MNKGMGH KFLKH I SRTRQLL? WDI SGFQLS SHTQYRTAFETI I 
LLTKELE LY KEELQTKPAiiLAVN KMDLPDAQDKFHELMSQliQN P 
KDFLHLFEKNM I PERTVEFQHI I PI SAVTGEG I E EljKNC I R KSL 
DEO ANQENDALH KKQLLNLW I S DTMS S TE P P S KHAVTTS KMD 1 1 


6425 


1850 


1144 


LAMEGGGGIPLETLKEESOSRHVLPASFEVNSLQKSNWGFLLTG 
L VGGTLVA VYAVATPFVTPALRKVCL P FV PATM KQ 1 ENWKMLR 
CRRG S LVD I GSGDGRI V I AAAKKGFTA VG YELNP WLVWYSR YRA 
WREGVHGSAKFY I SDLWKVTFSQYSNWI FGVPQMMLQLEKKLE 

KRPCTSMHFQLPIQA 


6426 


30 


56S 


SRoAAVGGMSVAGGEIRGDTGGEDTAAPGRFSFS PEPTLED1 RR 
LHAEFAAERDWEOFHQPRNLLLALVGEVGSLAELFQVJKTDGEPG 
PQGWSPRERAALQEELSDVLIYLVALAARCRVDLPLAVLSKMBI 
MRRRY PAH1ARSS S RKYTELPHGAI S EDOAVGPADI PCDS TGOT 
ST 


6427 


145 


95S 


AASWGPPHVPKAGKMVSWM I CRLWLVFGMLCPAYAS YKAVKTK 
NIREYVRV7MMYWI VFALFMAAEI VTDI FI SW FPFYYEI KMAFVL 
WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LSFGKRGLNIAASAAVQAATKSQGALAGRLRSFSMQDLRSISDA 
PAPAYKDPLYLEDQVSKRRPP1GYRAGGLQDSDTEDECWSDTEA 
VPRAPARPREKPLIRSQSLRVVKRKPPVREGTSRSLKVRTRKKT 
VPS DVDS 


6428 


1982 


4 4< 


SGSGGKMEDHQHVPIDIQTSKLLDWLVDRRKCSLKWOSLVLTIR 
EKINAAI0DMPESEE1AQLLSGSYIHYFHCLRILDLLKGTEAST 
KNI FGRYSSQRMXDWQEI IALYEKDNTYLVELSSLLVRNVNYE2 

TGENVRGELLALVKDLPSQLAEIGAAAQOSLGEAIDVYQASVGF 
VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEb 
PEOVAEDAIPWGDFGVEAVSEGTDSGISAEAAGIDWGIFPESDS 
KDPGGDG I DWG DDAVALQ I T VLEAGTQ AF EG V ARG PDALTLLE Y 
TETRM0FLDELMELE1FLAQRAVELSEEADVLSVSQF0LAPA1L 
QGQT ?CE KMVTMV S VLED L 1 G K LTS LQLQH LFM 1 LAS PR YVDR VT 
EFLOOKLKOSQLLALKKELMVQKQOEALEEQAAbEPKLDLLLEK 
TKELOKLIEADISKRYSGRPVNLMGTSL 


6429


3413 


344/ 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTL 
WKSVTDKDAGDYLCVARNKVGDDYWLKVDX^VMKPAKIEHKEE 
NDHKVFYGGDLKVDCVATGLPNPEISWSLPDGSLVNSFMQSDDS 
GGRTKR Y WFNNGTLY FNE VGMREEGDY TC FAENOVGKDEMRVR 
VKV\ T TAPAT1 RNKTCLAVQVPYGDWT VACEAKGE PMPKVTWLS 
PTNK\aPTSSEKYOIYQDGTLLlOKAQRSDSGNYTCLVRNSAGE " 
DRKTVW 2 HVNVQPPKI NGNPN FI TTVREI AAGGSRKLIDCKAEG 
IPTPRVLWAFPEGVVLPAPYYGNRITVHGNGSLDIRSLRXSDSV 
QLVCKARNEGG EAR L 1 VQLT VLE PM E KPIFHDPISEKI TAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLOSGOOLQRFYHKADGMLK 
ISGLSSVDAGAYRCVARNAAGHTERLVSLKVGLKPEANKQYHNL 
VSI 1 NGETLKLPCTPPGAGQGR FS WTL PNGMKLEG P QTLGR VS h 
LDNC-TLTVREASVFDRGTYVCRI4ETEYGPSVTS1PVIV1AYPPR 
rTSEPTPVIYTRPGNTVKLNCMAMGIPKADITWEIjPDKSHLKAG 
VQARLYGNRFLHPQGSLTIQKATORDAG? YKCMAKNILGSDSKT 
TYIHVF 


6430 


1946 


602 


RTRVSTGbRRTLLWSEAVGASSTRGDTGIFGSGEGGAGPGGGEG 
AMLEAMAEPSPEDPPPTLKPETOPPEKRRRTIEDFl^KFCSFVLA 

QTFVKKAKSSKRRAAQAGPTOPGPPRSTFSRLOAPDSATLLEKM 
XLKDSLFDLDGPKVASPLSPTSbTHTSRPPAALTPVPLSQGDLS 
HP PR K KDR XNR KLG PGAG AG KG V LRR PR PT PGDGE KRSR I KK S K 
KRKLKKAERGDRLPPPGPPOAPPSDTDSEEEEEEEEEEEEBEMA 
TWGGEAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 
TSQDGDAS SSEGEMRVMDEDI MVESGDDS WDLITCYCRKPFAGR 
PMIECSLCGTWIHLSCAKIKKTNVPDFFYCOKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSS YKIi PAYAPYLPC EACAtfQDGR KGGAYAG KMEATTAGVGR 
LEEEA.LRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNyVPEDEDLKKRRVPOAKPVAVEEKVKEQLEAAKPEPV 
IEEVDLANLAPRKPDWDLKRDVAXKLEKLKKRTQRA1AELIRER 
LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLGTMGS R I KONPETT FEVY VE VAY PRTGGTLSDPEVQRQFPE 
DYSDOE VbQTLTKFCFP FYVDSLTVSQVGQNFTFVLTDI DSKQR 
FGFCRLSSGAKSCFCILSYLPKrEVFYKLLNILADYTTKRQENO 
WNELLETLHKLPIPDPGVSVHLSVHSYFTVPDTRELPSIPENRN 
LTEYFVAVDVNNMLHLYASMLYERRIL1 1 C S KLSTLTACI HGS A 
AMLY PM YWQHVYI P VLP PHLLD Y CCAPMPYLIG IHLSLMEKVRN 
MALDDWI LNVDTNTLETPFDDLQSLFKDVI SSLKNRLKKVSTT 
WO 01/53312

PCT/US00/34263

TGDGVARAFLKAOAAFFGSYRNALKIEPEEPITFCEEAFVSHYR 
SGAMR0FLONAT0LOLFK0F1DGRLDLLNSGEGFSDVFEEEINM 
GE YAGSDKLYHO WLS TVRKGSGA I liNTVKTKANPAMKTVYKFD I 
AENGCAPTPEEOLPKTAPSPLVEAKDPKLREDRRPITVHFGQVR 
PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
TKFAAKFPTRGWTSSSH 


6433


1S24 


464 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA 
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 
LFLLLLLVATTGPVGALTDEEKRLMVELHNLYRAQVSPTASDML 
HKRWDEELAAFAKAYAROCVWGHNKERGRRGENLFAITDEGMDV 
PU\MEEWHHEREHYNI,SAATCSPGQMCGHYTQWV?AKTERIGCG ! 
SHFCEKLQGVEETNIELLVCNYEPPGNVKGKRPYOEGTPCSQCP 
SGYHCKNSLCE P I GS P E DAQDLP Y LVTEAPS FRATEAS DSR KMG 
AEGPDKPSWSGLNSGPGHVWGPLIiGLLLLPPLVLAGI F 


6434 


40 


2002 


MPOLNFGMADPTOMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE ( 
SPELRQKSPLFOFAEISSSTSHSDASTKQCQTSALFQFAEISSN 
TSOLGG AEPVKRCGKSALKQLAEMCLAS EGMKMEES XL I KAKES 
DGGR1 KELEKGKEEKJE1 KM E XTDETRLQKEAE FEKS AKENLRBS 
KELRNFEALQI DDIMAI KMEDPKE IRKEELEEEHKCSHFPDFS Y 
SASSKI I 1SDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKS TPKKTCKKROSSESD I ESVIYT I EAVAKGDWG 
IEKLGDTPRKKWTSSSGKGSILDAKPPKKKVPCSREKKKSKEKS 

sdttkesrppdf: S I saskni sgetpegikaepltpmedalpps 
usgoak pedsdch r k i etcgs rksers ckgaly ktlvs egmlts 
lranvdrgkrssgkgnssdhegcwneeswtfsosgtsgskkfkk 
tkpkedcllgsakldeefekkfnslpqyspvtfdrkcvpvprkk 
kktgnvsseptktskgsgdkwsnkqlfldaihpteaifsedrnt 
mepvh kvknips i fntpeptttartfggqpkekskenpdys pcq 

DTQRAGYHHEEVUIMTNLJyiNlJCGGVYLKQliRHTAMTNA 


6435 


2227 


657 


ALQREAAAAYAHP E Y EERFLQEETVS QQ INS I ELLQTR PLALPE 
WKSCRPLQRQVKLRGRPASQPTVIRGITYYKAKVSEEENDIEE 
QODEF FSGDNGVDbLI EDOLLRHNGLMTSVTRRPAATRQGHSTA 
VTSDLNARTAPW5SAL.POPSTSDPSIANHASVGFTLQTTSVSPD 
PTRESVLQPSPQVPATTVAKTATQQPAAPAPPAVSPREALMEAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPEEEDDIRNVI 
G RCKDTLS TI TGPTTQN TYGRNEG AWM KDPLAKEER I YVTNY Y Y 
GNTLVE FRNLEN FKQGR WSNS Y KLPYS WIGTGHWYNGAF YYNR 
AFTRN1 1 KYDLKQRYVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDENG1>W1i1YPA1_'DDEGFSOEVIVLSK1»NAADJjSTQKETTVJRTG 
LRRNFYGNCFV1 CGVLYAVDS YNQRNANI SYAFDTHTNTQI VPR 
LLFENEYFYTTQ J DYNPKDRLLYAWDNGHQVTYHVI FAY 


6436 


1295 


341 


GACR PPVRQDPDSGPDYEALPAGATVTTHMVAGAVAG I LEHCVM 
YPIDCVKTRMOSLOPDPAARYRNVLSALWRIIRTEGLWRPMRGL 
NVTATGAGPAHALYFACYEKLKKTLSDVIHPGGNSH1ANGAAGC 
VATLLHDAAMNPAEWKQRMOMYNSPYHRVTDCVRAVMQNEGAG 
AFYRSYTTQIiTMNVPFOAIHFMTYEFLQEHFNPORRYNPSSHVL 
SGACAGAVAAAATT PLDV CKTLLNTQ ES J .ALNSH I TGH I TGMAS 
AFRT VYO VGG VT AY FRG VQARV I YQI PSTAI AW S VYE F F KYLIT 
KRQEEWRAGK 


6437 


1826 


360


P P AP A P PA S P AR HVTRTARGHLBGGSRAP PLLQ A V FLQ I XNMVK 
L I HT IADHGDDVM CCAFS FS LLATCSIiDKTI RL Y S LRDFTELPH 
S P LKFHT YAVH CCCFSPSGHI LAS CS TDG TTVLWNTENGQMLA V 
ME0PSGS PVRVCQ FSPDS TCLASGAADGTWLVJNAQS YKL YRCG 
S VKDGSLAACA FS PNGSFFVTGSS CGDLTVWDDKMRCLKSEKAH 
DLGITCCDFSS0PVSDGEQGLQFPR1.ASCG0DCQVKIWIVSFTH 
1LGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
TNTENILHTLTQKTRYVTTCAFAPKTLrLLATGSMDKTVNIWQFD 
LETLCQARSTEH0LK0FTEDWSEEDVSTWLCAQDLKDLVG1FKM 

LSSG I PDEF I CP I TRELMKDPVI AS DGYS YEKEAMENWDPAKRN 
RTSPP 


6438


10S 


or-, 


r\;AT i d t\ vm Tr\ r Ffirzi i vi?vr*T t tv/^tpjI ahtct t ot/ot nnTi nt ktw 
Dvyi jjKrtnj*ir y i vj^ivl v r iVjjjjUAy i r^yrviuljrVFJjDvlbr JjI\ V 

NPALPLSPTGliAGSl-TNALSNGLLSGGLLGlLENLPLLDILXPG 

GGTSGGLLGGbbGKVTSVI PGLNNI IDIKVTDPQLLELGLVQSP 

CGHRLYVT3 PLGI KbQVNTPLVGAS LLRLAVKLD I TAE I LAVRD 

KVLPELVOGNVCPLWEVLRGLDI TLVHX)I VNMLIHGLOFVI KV | 


6439 


23 


422 


SIQTASAITTEMASCSQGIQQLLOAEKRAAEKVADARKRKARRL f 
KQA K EE AQM E VE 0 Y R REREHE FQS KQQAAMGS QGNLSAEVE QAT | 
RRQVQGMQS SOORNRERVLAQLbGMVCDVRPQVHPNYR I SA 


6440 


3 




RARWNS DMG DL PGLVRLS I AbR I Q PNDGPVP Y KVDGQR FGQNRT 

I klltgssykvevk: kpstlqveni s iggvlvplelkskepdgd 

RVVYTGT YCTEG VTPTKSG ERQP1 0 2 TMPFTDIGTFETVWQ VKF 
YNYHKRDHCQWGS ?FS VI EYECKPNETRSI/IWVNKESFb 


6441 


234 


1373 


KSGGLRR RQR PGRS A^VGE EELPPGKEKFKAAMLLGSVGDALGY 
RNVCKENSTVGMK I C'EEbQRSGGbDHbVLSPGEWPVSDNT I MH I 
ATAEALTTDYWCLDDLYREMVRCYVE I VEKLPERRPDPAT1 EGC 
AQLK PNN YLLA W H TP FNE KGS GFG AATKAMC I GLRY WKP ERLET 
LIEVSVSCGRMTKKHPTGFLGSLCTALFVSFAAQGKPLVOWGRD 
MbRAVPbAEEYCR KT3 RKTAEYQEHWFYFEAKWQFYLEERKI SK 
DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 
bLAAGNS WTELCH RA.MFKGGESAATGTI AGCLFGLLYGLDLVPK 
GbYQDbEDKEKbEDbGAALYRLSTEEK 


6442 


34 


796 


AEDPAGGLAGODTMFARGLKRKCVGHE EDVEGALAGbKTVS S YS 
bORQSbbDMSbVKbObCHMLVEPNLCRSVblANTVRQIOEEMTQ 
DGT WRT VAPQAA E RA P bDR LVSTE I bCRAAWG QEGAHPASG LGD 
GHTQG P VSDbCP V TS AQAPRHbQS S AW E MDG PR ENRGS FH K S bD 
OIFETbETKNPSCMEELFSDVDSPYYDbDTVbTGMMGGARPGPC 
EGbEGLAPATPGPSSSCKSDbGEbDKWEILVET 


6443 


2 




I4AS PAAS S VRF PR P K KEPQTbVI PKNAAEEOKbKbERbMKNPDK 
AVP I PE KMS E WAP R P P PE FVRDVMGS SAGAGSGEFHVYR KbRRR 
EYQRODYMDAMAEKOKbDAEFQKRbEKNKIAAEEQTAKRRKKRQ 
KbKEKKbbAKKKKbEOKKOEGPGQPKEOGSSSSAEASGTEEEEE 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPI PEPKPGDLI EI FRP FYRHWAI YVGDGYWHLA 

DDCP\/ir , K^7iiicxn^ciiT Tnvi TWifCT t vrnr&fiCTivvnxrNTM vun 
r ri>E* vfi\jJv&i\t\o Vnis/iJjJL jJfvAX VJVft-tJLiJbl JJVrtuoUAI y ViNjNAJllJ 

DKYSPbPCSKI I ORAESbVGQEVbYKbTSENCEHFVNELRYG VA 
RSDQVRDVIIAASVAGMGLAAMSblGVMFSRNKRQKQ 


6445 


2 


753 


AG AAGAAG AARS PR P Q AHTKG VRG LP S RRRS PDCGRMELAAGS F 
S EEQF WE ACAEbQO PA LAGAD WQbbVETS G I S I YRLbDKKTG LY 
EYKVFGVbEDCS PTbLADI YMDSDYR KQWDQYVKELYEQECNGE 
7WYWE V KY P FPMSNR D YVYbRQ RRDbDMEGRKIHVI liAR S TS M 
PObGERSGVlRVKQYKOSbAIESDGKKGSKVFMYYFDNPGGOlP 
SWLINWAAKNGVPNFLKDMARACONYLKKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSLASGATGGRGAVENEEDbPEbS 
DSGDEAAWEDEDDADbPHGKQQTPCLFCNRbFTSAEETFSHCKS 
EHQFNI DSMVHK.HGbE FYGYI KLINFIRLKNPTVEYMNS I YNPV 
PWEXEEYliKPVLEDDLbbQFDVEDLYEPVSVPFSYPNGLSENTS 
WEKLKKMEARALSAEAAIARAREDbOKMKQFAQDFVMHTDVRT 
CSSSTSVIADbOEDEDGVYFSSYGHYGIHEEMLKDKlRTESYRD 
FIYQNPH I F KDKWbD VG CGTG I bSM FAAKAGAKKVLG VDQS E I 



LY0AMD2 IRLMKLEDT1TL1 KGKI EEVHLPV EKVDV 1 J SEWMGY 

DRIAFWDDVYGFKJ^JSCMKKAVIPEAWEVLDPKTLISEPCGIKH 
I DCHTTS I S DLEFS S DFTLK I TRTSMCTA I AG YFD1 Y FEKN CHN 
RWFSTGPQS TKTHWKQT VFLLEKPPS VKAG E ALKG KVTVH KNK 

VHDDCl T\/T1 TT MMPTHTVPl f\ 
i\Utrt\t3U 1 V llj i LtNiMo lyi loJjy 


6447 


1554


1068 


RLGPAEWHLSGPCHATLGAANRGRALGVRAAWRGAPLCQRVMMP 
SRTNLATG I PSS KVKYSRLS S TDDGY I DLQFKKTPPK I PYKAI A 
LATVLFL2GAFL1 1 IGSLLLSGYISKGGADRAVPVLIIGILVFL 
PG FYHLR I AYYAS KG YRGYS YDDI PDFDD 


6448 


7<i | 559 

i 


GOVLSHCYHYHSSRWRRGGLSRGRGAGVMALVPYEETTEFGLQK 
FH KPLATFS FAN HTI Ql RQDWRHLGVAAWWD.AA I VLSTYLEMG 
AVEbRGRS AVELGAGTGLVG I VAALLACR I R Y EREKNFLiAMLER 
QFI VRKVHYDPEKDVHl YEAOKRNQKEDL 


6449 


597 


1876 


EYGVCENLRKLElTGVSCRDVYAKbLHRYRHILGLWQPDIGPYG 
GIjLNVWDGLFI IGWMYLPPHDPHVDBPMRFKPLFRIHLiMERKA 
ATVECMYGHKGPHKGHI01VKKDEFSTRCN0TDHHRMSGGR0EE 
FRTWLREEWGRTJjEDI FHEHMQELI LMKF3 YTSQYDNCliTYRRI 
YLPPSRPDDLI KPGLFKGTYGSHGLEI VMLS FKGRRARGTK1TG 
DPfJ 1 PAGQQTVE1DLRHRI OLPDLENQRNFNELSRIVbEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGE PGDAVAAAKOPAOCGGGQ PFVLPVGVS SRNEDYPRTCRM 
CFYGTGLIAGHGFTSPERTPGVFILFDEDRFGFVWLELKSFSLY 
SRVQATFRNADAPS PQAFDEMLKNI QSLTS 


6450 


84 B 


269 


FVPAPRTVSGKRSLFGEWEERGEGEQRTGREFSGNGGRAVEAAR 
KRLLCGLWLWbSLIjKVLOAOTPTPLPLPPPMCSFOGNOFQGEWF 
VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGOHC 
DTWS YVL I PAAOPGQFTVDHR VWTHEQAGR PQDQPAGQELVAAS 
RDAGPVHLPGOS SGPLG 


6451 


232 




HS PTPPTS PRAS TMEDVKLE FPSLPQCKEDAEEW TYPMRREMQE 
ILPGLFLGPYSSAMKSKLPVL0KHGITHIICIRQNIEANF3KPN 
FCOLFRYLVLDI ADNPVENI 1 RFFPMTKEF1DGSLQMGGKVLVH 
GNAG I S R S AA F V I AY I METFG MK YRDAFA YVQER R FC I NPNAGF 
VH0LQEYEAIYLAKLTI0MMSPLOIERSLSVKSGTTGSLKRTHE 
EEDDFGTMOVATAQNG 


6452 


1 


652 


RTRGESSNMEPLAAYPLKCSG PRAXVFAVLLS I VLCTVTLFLLQ 
LKFIjKPKI NSFYAr EVKDAKGkTvSIjEKx KGK VijJjV VNVAoJJLy 
LTDRNYLGLKELKKEFGPSHFSVXAFPCNQFGESEPRPSKEVES 
FARKNYGVTFP1FHKIKILGSEGEPAFRFLVDSSKKEPRWNFWK 
YLVNPEGOVVKFWRPEEPIEVIRPDIAAIiVROVl I KKKEDL 


6453 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 

DPG1YKCVVCGTPLFKSETKFDSGSGV7PSFHDV3NSEAITFTDD 
FS YG MHR VETS CS QCG AHLGH I FDDGPR PTG KR YC I NS AALS FT 
PADSSGTAEGGSGVASPAOADKAEL 


6454 


827 


223 


HRRWLPGLSMSFRRTLPRPLSLCLSbCLCljCbAAALGSAQSGSC 
RDKKNCKWFSQQEbRKRLTPLOYHVTQEKGTESAFEGEYTHHK 
BPGIYKCWCGTPLFKSETK?DSGSGWPSFHDVIt?SEAITFTDD 
FSYGMHRVETSCSOCGAHLGH I FDDGPR PTG KRYC INS AALS FT 
PADS SGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


KVHLATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKI>EDFINNINSVLE 
SLYIEI KRGVTEDDGRPI YALVNLATTS ISKMATDFAENELDLF 
RKALELI3DSETGFASSTNILNLVDQLKGKKRRKKEAEQVL0KF 
V0NKWL2EKEGEF7LHGRA1LEMEQYIRETYPDAVK1CNICHSL 
LIQGQSCETCGIRKHLPCVAKYFQSNAEPRCPHCNDYWPKEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


RPQSRSISMWRNSLLQVSSGLRWLRVCAMVDILGERKLVTCKGA 
TVEAEAALONKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTA 1 PKLVI VKQNGEVITNKGRKQI RERGLAC FOD WVEAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFFDFDKKIPV 
KLF P L PLL YVGNH I S G LS S TS KLS LPMFTVLR K FT I P LT LLLE T 
1 1 LGKQYS LNI I LSVFAI I LGAFI AAGSDLAFNLEGY I FVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTLI ISVSTG 
DLQOATEFNOWKNWFUjQFLLSCFLGFLLKYSTVLCSYYNSAL 

ttawgai knvs vayig i l iggdy i fsllnf vglni cmagglr y 
sfltlssolkpkpvgeenicldlks 


6458 


23 


692 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI 3 HFPDFDKXIPV 
KLFPLPLLYVGNH I SGLSSTSKLS LPMFTVLR KFT I FLTLLLET 
1 1 LG KQY S LN 1 1 LS VFA 1 1 LG AF I AAGSDLA FN LEG Y 1 FV FLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFM3 I PTLI ISVSTG 
DLQQATEFNOWKNWFILQFIiLSCFLGFLLMYSTVIjCSYyNSAL 
TTAWGAI KNVS VAYIG I LIGGDYI FSLLNFVGLNICKAGGLRY 
SFLTLSSCLKPKPVGEEN I CLDLKS 


6459 


23 




PTTGFPVTNFPWNWPDGKPPIM1LYVSKLNK1 : HFPDFDKKIPV 
KLFPLPLLYVGNH I S GLS S TS KLS LPMFTVLR K FTI PLTLLLET 
1 1 LG KQ Y S LN 1 1 L S VFA 1 1 LGAFI AAG SDLAFN LEG Y 1 F V FLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMT I PTLI 3 SVSTG 
DLQO ATE FNQW KNWFI LQ FLLSC FLGFLLMYS T VLCS Y Y NS AL 
TTAWG A I KNVS VAY I G I L I GGDY I FS LLN F VG LN I CHAGGLR Y 
SFLTLSSOLKPKPVGEENI CLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNVIPDGKPPIMILYVSKLNK1 3 HFPDFDKKI PV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRK FTI PLTLLLET 
1 1 LG K0 Y S LN 1 1 LS V FA 1 1 LGAFI AAGSDLA FN L EG Y 1 FV FLND 
IFTAAiaGVYTKQKMDPKELGKYGVLFYNACFM: I PTLI ISVSTG 
DLQQAT E FNQW KNWFI LQ FLLSC FLG FLLMY S 7VLCS Y Y N S AL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSOLKPKPVGEENI CLDLKS 


6461 


1653 


360 


LQQRTLR l TAVGQTHPI AWMAWEPS LG AFYGFAS F ITFVN CMY F 
LSIF10LKRHPERKYELKEPTEEQQRLAANENGEINH0DSMSLS 
LI STS ALENEHT FHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS F VFGATSLS FS AFF WHHCVNREDVRiAVI I MTCCPGRSS 
YSVQWVQPPNSNGTNGEAPKCPNSSAESSCTKKSASSFKNSSQ 
GCKLTN LOAAAAOCHANSLPLNSTPQLDNSLTEHSMDN DI KMHV 
APLEVOFRTNVHSSRHHKNRSKGHRASRIjTVLREYAYDVPTSVE 
GS VQNGLPKSRLGNNEGHoKbKKAY IjA J Kh-Ky i N r viJbi>iiAC 
STLP^SP^FEKPVSCTSKKDALRKPAVVELENOQKSYGLNLAI 
QNGPIKSNGQEGPLLGTDSTGNVRTGLWKHETTV 


6462 


3 


773 


S EELDR E KKLKJSDS PRK I PWKtSG VPSL f vbL>lz>l ivt. i rKfcAXn 
PDSOSMEESKLKNDDRKTPVNWKDSRGTRVAVESPMSQKOSYIQ 
YLHAYPYPQMYDPSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYG 
KMSGREETEKVNTSPSVNTKTTTESKALDLLOOHANQYRSKSPA 
P VEKATAER EREAE RSRDRHS P FGQRHLHTH HH TH VG MGY PL I P 
GQYDPFQGLTSAALVASQQVAAQASASGMFPGORRE 


6463 


2 


350 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPV7VDEDLKDSSD 
LHOAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSOEFYELL 
NKRRS VR F I SNEQVPMEVI DNV IRTAGL 


6464 


12 


1154 


GILROKEREERNRIHKKE1LFLEHLLWPSEMSSLSGKVQTVLG 
LVEPSKLGRTLTHEHLAMTFDCCYCPFPPC0BA3SKEPIVMKWL 
yWIOKNAYSKKENLOLNOETEAIKEELLYFKANGGGALVENTTT 
G I SRDTQTLKR LAE ETGVH 1 1 SGAGFYVDATHS SETRAMS VEQL 
TDVLMNE I L! 1G ADGTS I KCGI IGE IGCSWPLTESERKVLQATAH 
AQAQLGCPVI IHPGRSSRAPFQI I R I LOEAGAD I S KTVMSHLDR 
T1LDKKELLEFAQU5CYLEYDLFGTELLHYQLGPDIDMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRbMKYGGHGYSHILTNW 
PKMLLRGITENVLDKILIENPKQWLTFK 


6465 


126 


1396 


K/ITVFFKTLRNHWKKTTAGLCLLTWGGHWLYGICHCE^LLRRAAC 
OE AO V FGNQL 1 PPN AQ V K KATV FLN P AA CKG KAKTL FE KNAAPI 
LHLSGMDVTI VKTDYEGQAKKLLELMENTDVI I VAG GDGTLQEV 
VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI 
TDATLA I V KG E T VP LD VLQ I KG E KEQ P VFAMTGLR WGS FRDAGV 
KVS KY WYLEPLK2 KAAH FFSTLKEW PQTHQAS I S YTGPTERPPN 
EPEETPVQRPSLYRRILRR1ASYWAQP0DALS0EVSPEVWKDVQ 
LSTIELS1TTRNN0LDPTSKEDFLNIC1EPDTISKGDFITIGSR 
KVRN P KLEVEGTECLQAS QCTLL1 PEG AGGS FS 1 DSEE YEAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTC 


6466 


1124 


828 


VARGTELSOLEKAHPPADMGRRKSKRKPPFKKKMTGTLETQFTC 
PFCNHEKSCDVKMDRARNTGVI SCT VCLEE FQT P I TYLSEPVDV 
YSDWIDACEAANQ 


6467 


301 


2571 


GELRVLAIiAHGELACHAVbTASLLSLRSRLMDSDMDYERPNVET 
IKCVWGDNAVGKTRLICARACNATLTQYQLLATHVPTVWAIDQ 
YRVCOEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFS IANPNSX.HHVKTMWYPE1 KHFCPRAPVILVGCQLDLRY 
ADLE A VNRAR R P LAR F I KPNE I L P PE KG REVAKE LG I P Y YETS V 
VAOFG I KDVFDNAI RAAL I SRRHLQPWK S HL-RNVQRPLLQAP FL 
PPKPPPP1IWPDPPSSSEECPAHLLEDFLCADVILVLQERVRI 
FAHKIYLSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 
SE0HHHHHH H H HG RD FLLRAAS FD VCE S VDEAGG S G PAGliRAST 
SDG1 LRGNGTG Y LPGRGRVbSSWSRAFVS 1 QEEMAEDP1»TYKSR 
LMVWKMDSS I QPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMKVAH I LNNE AFS^NQEITKAFHVRRTNRVKECLAKGT 
FS DVT F I LDDGT I S AHKPLL I SSCDWMAAMFGG P FVES STREW 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCLPHL 
" VALTEQY TVTG LMEATQMMVD I DGD VLV FLELAQ FH CA YQ1AD W 
CLHK1CTNYNNVCRKFPRDMKAMSPEN0EYFEKHRWPPVWYLKE 
EDHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSSPSSSAASSS 
SPSS S SAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRIiAAGLRLLPMLGLLOLLAEPG 
LGR\mKIJUJKDDVRHKVHLNTFGFFKDGY^WVI^VSSI>SL^ T EPED 

yn^Tirpct nDTVMnrrccVT.npmnJYPT T.VtfO'iVQVTT.T.Tl.'nT 
KUvi Jur Ji-iiJK i j\r*U\jr I Jj.Ur.Uv.rt I V. J. ±jj\i\\JCj VOVl lhjx uu a. 

SRSEVRVKSPPEAGTQLPKI I FSREEKVLGQSQEPNVNPASAGN 
0T0KTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFOFFFNIS 
TDDQEGLYSLYFHKCLGKE1.PSDKFTFSLDIEITEKNPDSYLSA 
GE I PLP KLY 1 S MAFFFFLSGTI Wl H ILRK RRNDVFKI HWLMAAL 
PFTKSLSiiVFHAI DYHYISSQGFPIEGWAWYY1 THLLKGALLF 
I TIAL1 GTGWA FI KH I LSDKDKK I FMI VI PRRVliANVAYI I IES 
TEEGTTE YGLV3KDSLFLVDLLCCGAI LFP WWS I RHLQEASATD 
GXGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTN MAAL A P VG S PASRGPRLAAGLR LLPMLGLLQLLAEPG 
LGRVHHIALKDDVRHKVHLNT FG F FKDG Y MWNV S S 1>S LNE PED 
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKXQSVSVTLLILDI 
SRSEVRVKS PPEAGTQLPKI IFSRDEKVLGQSQEPNVNPASAGN 
QTOKTQDGGKSKRSTVDSKAMGEKSFSVmWGGAVSFQFFFNIS 
TDDOEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEI PLPKLYI SMAFFFFLSGT1 WI HILRXRRNDVFKIHWIjMAAIi 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
1TIAL2GTGWAFIKHILSDKDKKIFM1VIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPVWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 


6470 


272fc 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAFRS 
OPLikKElAjCRi PGPv^kPLPGAliIjKPRTLLSSAAETGRSRHPDT 
OHPSSGGRCRGGTES P SS AAGR P AS MAE AE EDCHSDT VRADDDE 
ENES PAETDLQACl»OMFRA0WMFELAPGVSSSNLENR PCRAARG 
SLOKrSADTKGKOEOAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLV PD.T E FKI TYTRSPCGDGVGNS YI EDNDDDS KMADLLS 
YFOQ0LTFOESVXKLCQPELESSOIH1SVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
I PYTSWREMFLERPRVRFDGVYISKTTY1 RQGEQSLDGFYRAWHQ 
| VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


29S 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDFALRRRRR 
GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLOERTSGGLLSEAPN 
EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDL1 LENTS K 
VPAPKDVL^QVPNAKKLRRKEOLWEKLAKOGELPREVRRAQAR 
LLNPSATRAKPGPQDTVERPFYDLWAEDNPLDRPLVGODEFFLE 
OTKKKGVKR PAR LHTKPSQAPAVE VAPAGAS YNPSFEDHQTLLS 
AAHE VELQROKEAEKLERQLALPATEOAATOE S TFQELCEGLLE 
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 

\o-:rlrvcx?aalraarlrhoelfrlrg i kaqvalrlaelarrqrr 

ROARREAEADKPRRLGRLKYQAPDIDVObSSEl,TDSLRTLKPEG 

Nl lrdrfks forrnm I EPRERAKFKR ky kvklvekrafre I QL 


6472 


3 


897 


S CGS DRAQWAMEFPFDVDALFPER I T VLDOHLRPPARR PGTTTP 
AR VDLQQQ I MT 1 1 DELGKAS AKAQNLS A P I TS AS RMQSNR HW Y 
I LKDS S AR PAG KG A 1 1 G F I KVGY KKL FVLDBR E AHNE VE P LC I L 
DFYIHESVQRHGHGRELFQYMIiOKERVEPHQLAIDRPSOKLl,KF 
LNKKYNLETTVPQVNNFVlFEGFFAHQHRPPAPSbRATRHSRAA 
AVDPTPAA PAR KLPPKRAEGDI KP YS SS DREFLKVAVEPPWPLN 
RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


SSAVEFVWEGEKMAAEPNXTEIQTLFKRLRAVPTNKACFDCGAK 
N PS WAS I TYGVFLCI DCSG VHRSLG VHLS FI RS TELDSNWNWFQ 

T D r > Mr\\Tnr , Ktnt3'TL r rn Tr'n*Drvu/ , >/" , T , lvKiriiix"TT'vVMCD t\ 7\ nMVD pvt r>r\ 
LtH^rVJ VUuNAWAl At* rKy.m»t- iANUAN 1 iVXIVorCAAyfJ I KLAi. KU 

LGS AALARHGTDLWI DNMS SAVPNKS PE KKDSD FFTEHTQPPAW 
DAFATEPSGTQQPAPSTES SGLAOP EHGPNTDLLGTS PKASLEL 
K S S 1 1 GKKKPAAAKKGLGAKKGLGAQKVSSQSFSE I ER0AQVAE 
KLRE00AADAXK OAEESKVASMRLA YQEL0I DR 


6474 


3 


462 


LORORQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KG K KE E KQE AG KEG TAPS ENGETKAEE I H I S RS TVNVSTS RGTP 
PSTLSVKGQIETVRVKGTBN 


6475 


3 


462 


uOROROHPAAAPAVPVRCFTFCFTDIVlMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAXKEPGAKISRGA 
KG KKE EKQEAG KEGTAPS ENGETKAE E I H I S RS TVNVSTSRGT P 
PSTLS VKGQI E TVR VKGTEN 


6476 


106 


1090 


ARAMAQ YKGTMREAG RAMHLLKKR ERQREQME VLKQR I AEET I L 
KSOVDKRFSAWYDAVEAELKSSTVGLVTLWDMKARQEALVRERE 
RQLAKROHLEEQRLOOSRCREQEORRERKRKISCLSFALDDLDD 
OADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAOREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
OGLRKDFLELRSAGVEQLMFI KEDLILPHYHTFYDFI I ARARGK 
S GPLFSFDVHDDVRliLSDATME KDESHAG KWLRS WYEKN KHI F 
PASRWEAYDPEKKWDKYTIR 


6477 


227 


915 


LQGHLMG I MAASR F LS RFWEWG K» I VCVGRN YADH VREMRS A VL 
SE P VLFLKPS TAYAPEGSP I LM PA YTRNLHH KbELGWMGKRCR 
AVPEAAAMDYVGGYAbCLDMTARDVQDECKKKGLPWTLAKSFTA 
SCPVSAFVPKEKI FDPHKLKLWLKVNGELRQEGETSSMI FSIPY 
I 2SYVSK1 ITLEEGDI ILTGTPKGVGPVKENDEISAGIHGLVSM 
TFKVEKPEY 


6478


2 


1495 


FVSSRILPESLASSEASTLEAT^GRKSEEDCSSWKKOTTNIRKTF 
1 FMEVLGSGAFSEVFLVK0RLTGKLFALKC1 KKSPAFRDSSLEK 
E1AVLKKIKHENIVTLEDIYESTTHYYLVM0LVSGGELFDRILE 
RGVYTEKDASLVIOOVLSAVKYLHENGIVHRDLKPENLLYLTPE 
EN.S XI MI TDFGLS KMEQNGIMSTACGTPGYVAPEVLAQKPYSKA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYEFESPFW 
DDI SESAKDFI CHLLEKDPNERYTCEXALSKPWI DGNTALHRDI 
Y PS VS1/QIQKN FAKS K WROA FNAAAWHHMRKLHMNLHS PG VRP 
EVENRPPETQASETSRPSSPEITITEAPVLDHSVAbPALTOLPC 
QKGRRPTAPGGRSLNCLVNGSLH ISSSLV FMHQGSLAAGPCGCC 
SSCLNIGSKGKSSYCSEPTbbKKANXKQNFKSEVWVPVXASGSS 
HCRAGQTGVCLIM 


6479 


3 


945


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLFPVFDbS 
Y F I VS I L YLKYE PG AV ELS R RII P I A S WLCAMLHC FG S Y I LADLL 
LGEPLI DYFSNNS S I LLASAVWYLI FFCPLDLFY KCVCFLP VKL 
I FVAMKEWRVRKIAVGIHHAHHHYKHGWFVM1ATGWVKGSGVA 
LftS NFEQLLRGVWKP E TNE 1 LHMS F PTXASLYG AI LFTLQQTRW 
LPVSKASLIFIFTLFEWSCKVFLTATHSHSSPFDALEGYICPVL 
FGSACGGDHHHDNHGGSHSGGGPGAOKSAMPAKSKEELSEGSRK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPPYLT^SAKMTEVMMNTQPMEElGLSPRKDGbSY 
01 FPDPSDFDRCCKI .KDRLPS I WEPTEGEVESGELRNPPEEFL 
VCEDEQDNCEETAKEKKEO 


6481 


110 


1131 


KSRMDLDWNMFV I AGGTbAI P I LAFVAS PLbWP SAbI R I Y YWY 
WRRTbGMQVRYVHHEDYOFC i S FRGRPGHKPS 1 LMLHGFSAHXD 
MWbSWKFLPKNbHLVCVDMPGHEGTTRSSLDDLSIPGQVKRIH 
OFVECLKLNKKPFHLVGTSMGGOVAGVYAAYyPSDVSSbWLVCP 
AGLQYSTDNQFVORLK ELOGS AAVEK I PLI PSTPEEMSEMLQLC 
S YVRFKVPQQI LQGbVDVR 1 PH1W FYRKLFLEI VSEKSRYSLHQ 
NMDXIKVPTQI IWGKODQVLDVSGADMLAKS IANCQVELLENCG 
HS WWERPRKTAKLI I DFLASVHNTDNNKKbD 


6482 


2517 


56e 


E P V S K VS OS R R KAG V P TAN I E E S QAVE AAMANVP WAE VCEK FQA 
ALAbSRVELHKNPEKEPYKSKYSARALLEEVKALbGPAPEDEDE 
R PEAEDG PG AGDHALG LPAE WE ? EG PVAQRAVR LAV 1 3 FH LG V 
NHIDTEELSAGEEHLVKCLRLLRRYRLSHDCISLCJQAQNNLG1 
LWSEREEIETAQAYLESSEALYNOYMKEVGSPPLDPTERFLPEE 
EKLTEQERSKRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
XRQLEHNAYHPI EWAI NAATLSQFYINKLCFMEARHCLSAANVI 
FGOTGKISATEDTPEAEGEVPELYHORKGEIARCWIKYCLTLMQ 
NAQLSMQDN I G ELDLD KOS ELRALR KKE LDE EES IRKKAVQFGT 
GELCDA1 SAVEEKVSY LRPLDFEE ARELFLLGQHYVFEAKEFFQ 
I u\3 1 v i un 1 1, v vyuj-ibAj_ir KtjJjAr r r. I i^ribKKUivj^nJSjtKX/iniJt, 
PLTVDLNPOYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 
MLAKFRVARLYGKUTAPPKKELENLATSLEHYKFIVDYCEKHP 
EAAQ5IEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


622 


NSHLLCXSLPJU^PLSANGREAPJWEQRLAEFPJVARKRAGLAAQP 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLWKPRPASARAOP 
GLV0EAAQPQGSTSETPWNTAIPLPSCWDOSFLTNITFLKVLLW 
LVLLGLFVELE FGLAY FVLSLFY WMY VGTRG PEE KKEGEKSAYS 
VFNPGCEAIQGTLTAEObERELOLRPLAGR


6484 


201 


965 


QLAVKTKMSGLRPGT0VDPE1 ELFVKAGSDGSSIGNCPFCQRLF 
MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DFI KIEEFLEQTLAPPRY PHLSPKYKESFDVGCNLFAXFSAY1 K 
NTQKEAI>IKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSA5EPPV£ 
RRbFLDGDQLTLADC SLL P KLN 1 1 KVAAKKYRD FD I PAE F SG VW 
RYLHNAYAREEFTHTCPEDKEIENTYANVAKOK^ [ 


6485


6 


lOSi 


FVDLVRAVEFLPCFDSOKLEKECQSSEESMGSNSMRS I LEEDEE 
DEEPPRVliIiYHEPRSFEVGMLWJHKHKKYPFWPAWKSVRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSL ITDYRVRLGCGSFAGS FLEYYAADISY PVRKS I Q 
nnuT ^twt.dht .q kciq PFPD\A/r:r , PT .anvCit>c*v> fnviT.pnp <i p zv&pri 

VUViio 1 t\lJfiJL>Z> IWjSj rLtrV v\J<,r ±J\J\J tr\^ K l\l ILj r LJ t\ O t\f\r\i\Lj 

RANQKLVEYZ GKAXC-AESHbRAl LKSRKP SRWliQTFLS S SQYVT 
C VE TYLEDEGQLDLVVK YLQG VYQE VG A KVLQRTNGDR I R F I LD 
VLL PE AI 1 CA I S AGD EVD Y KTAEEKY IKGPSLSYREKEI FDNQL 

T FFPKIUPPP 


6486 


10 


561 


LVLQAGG AH LS P S R VTQG I Y YMLAFS EMPKPPDYSELS DS LTLA 
GGTGRFSGPLHRAWRMMNFRORMGWIGVGLYLLASAAAFYYVFE 
I S ETYNRLALEHI QOHPEE PLEGTTWTHSLKAQLLSLP F W VWTV 
2 F LVPY LQMFLF LYS CTRADP KTVG Y C 1 1 P I CLAV I CN R HQ AF V 


6487 


352 


863 


S FLKPLRGKMS VTLHTDVGDI Kl E V FCE R T P KTCEN FLALCASN 
YYNGCI FHRHI KGFMVQTGDPTGTGRGGNS I WGKKFEDE YSEYL 
KHNVRGWSMANNGPNTNGSQFFITYGKOPHLDMKYTVFGKVID 
GLETLDELE KLP VNEKT Y R P LNDVH I KDITIHANPFAQ 


6488 


87e 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPJbFGARGPSWPFSPRVP 
ME PPNLY PV KJLYVYDLS KG bARRLS P I MbGKQbEG 2 WHT S I WH 
KDEFFFGSGGISSCPPGGTLLGPPDSVVDVGSTEVTEEIFLEYb 
SELGESLFRGEAYNLFEHNCNTFSNEVAOFLTGRKIPSYITDLF 


6489 


1457 


375 


KVAKMATALSEEELDNEDYYSLLNVRREASSEELKAAYRRLCML \ 
YHPDKHRDPELKSOAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAE1REEFERLQREREERRLQQRTNPKGT 
I SVGVDATDLFDR YDEEY EDVSGSS FPQI EINKMH I SOS 1 EAPL 
TATDTAILSGSLSTQNGNGGGSINFALRRVTSAKGWGELEFGAG 
DLQG PLFGLKbFRN LTF RC FVTTNCALQ F SSRG I R PGLTT VLAR 
NLDKNT VGYLQWHCS S PLLOVQRPHRNTRACAPE PS FRP FLHVP 
TWUAECSGARTPSTAKTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGP RAAAAAAATVLFGGAG PTETM FVARS I AADH 
KDLIHDVSFEFHGRRMATCSSDQSVKVl^DKSESGDWHCTASWKT 
HSGSVWR VTWAHPEFGOVIoAS CSFDRTAAVWEE I VGESNDKLRG 
QSHVmCRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGlVRIYE 
APDVMNLS QWS LQHE I SCKLS CSCISWNPSSSRAHSPMIAVGSD 
DSSPNA14AKV01 FE YNENTR KYAXAETLMTVTDPVHDI AFAPNL 
GRS FHI LA I ATKDVR I FTLK P VRKELTS S GGPTKFE I HI VAQ FD 
NHNSQVWR VSWNI TG TVLAS S GDDGCVR bWKANYMDNWKCTG I Ii 
KGNGSPVNGSSQQGTSNPSLG SNIPS LQNSLNGS SAGRKKS 


6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARS1AADH 
KDLIHDVSFDFKGRRMATCSSDQSVKVWDKSESGDWHCTASVJKT 
HS GS WIR VTWAH P E FGQVLAS CS FDRTAAVWEE I VGES NDKLRG 
QSHWVKRTTLVDSRTS VTDVK FAPKHMGLMLATCS ADG I VRI YE 
APDVMNLSOWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNEOTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNI TGTVLAS SGDDGCVRLWXANYMDNWKCTGI h 
KGNGSPVKGSSO0GTSNPSI>GSNlPSLQNSbNGSSAGRKHS


6492 


34 


2573 


IPFLKSCCCCCbFDFPPPPLDQVQEEECEVERVTEHGTPKPFRK 
FDSVAFGESOSEDEOFENDLETPPPNWQQLVSREVLLGLKPCEI 
KR0EV1NELFYTERAHVRTLKVLDQVFYQRVSREGILSPSELRK 
1 FSNLED1 LOLH 3 GLNE0MKAVRKRNETSV1 DQIGEDLLTWFSG 

rot&J\bl\xiA/iA J r LSIMjrr AiJl^lK^KURAJJoKrUl r vSjUJ^CSr* 

PLCRRLQLKD2 I PTQMQRLTKYPLLkDNIATYTEWPTEREKVKK 
AADHCR01 LNY VNO AVKE AENKQRLEDY QRR LDTS S LKLS EY PN 
VEEXjRNLDL'J'KK KM I HEGPLWKVNRDKTIDL YTLLLE DI LVLL 
CKQDDRLVLRCHSKILASTADSKHTFSPVIKLSTVL^OVATDN 
KALFVI SMSDNGAQ1 YELVAQTVSEKTVWQDL1 CRMAASVKEQS 
TKP1PLP0STPGEGDNDEEDPSKLKEE0HGISVTGLQSPDRDLG 
LESTLISSKP0SKSLSTSGKSEVRDLFVAERQFAK50K7DGTLK 
EVGEDYOI A I PrSHLPV5EERWALDALRjnjGLLKOLuVQQLGL,T 
EKS VQEDWCH FPRY RTAS QG PQTDS VIQNSEN I KAYHSGEGHMP 
FRTGTGDIATCySPKTSTESFAPRDSVGLAPODSOASNILVMDH 
MIMTPEMPTMEPEGGXjDDSGEHFFDAREAHSDENPSEGDGAWK 
EEKDWLRISGKYLILDGYDPVQESSTDEEVASSLTLQFMTGIP 
AVESTHOQOHSPONTHSPGAISPFTPEFLVQC'RWGAMEYSCFEI 
QSFSSCADSOSQIMEYIHKIEADLEHLKKVEESYTI1.COR1AGS 
ALTDKHSDK5 


6493 


557 


1147 


TPARMAYQSSS TS DCMSKTLDSASAHFAASA VVSAP V PS RS EVA 
KEONTGHNN 1 KG WOPSGTS KTLYS TNMAbS SSPGJS AVQLVRT 
VGHTTTNHL2PALCTSSPOTLPMNNSCLTNAVHLNNVSWSPVN 
VH I NTRTS A PS P TALKLAT VAASMDR VPKVTP SS AI SSI AR ENH 
EPERLGLNG1AETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPMRRCRRHRPRPLPKPPAAIMSASAVYVLD 

VRFKW I KHNNLYLVATSKKNACVSLVFSFLY KWQVFSEY FKEL 
SEES IRDNFVI 1 YELLDELMD FG YPQTTDS K J 3L.QE Y I TQ EGH KL 
ETGAPRPPATVTNAVS WRSEG I KY RKNEVFLDV3 BSVNLLVSAK 
GNVLR SEIVGS1 KMRVFLSGMPELRliGLUDKVLFDNTGRGKSKS 
VELEDVKFHOCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK 
PLIWIESVIEKHSKSRIEYMIKAKSQFKRRSTAJ^KVEIHIPVPK 
D ADSP KFKTTV G S V KVJ VPEN S E I VWS IKS FPGG KE Y LKRAH FGL 
PSVEAEDXEGKFPISVKFEIPYFTTSGIQVRYLKIIEKSGYOAL 
PWVRYITONGDYOLRTQ


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAA1KSASAVYVLD 
LKGKVLI CRNYRGDVBN3SEVEHFMPI LMEKEEEGMLS P I LAHGG 
VRFMWIXHNMLYLVATSKKNACVSLVFSFIiYiCWCJVFSEYFKEL 
EEES IRDNFVI I YELLDELMDFGYPQTTDSXI LQEY I T QEGHKL 
ETGAPR PPA7VTNAVS WRSEG I KYR KNEVFLDVIES VNLLVSAN 
GNF^RSEIVGSIKMRVFLSGMPELRLGIjNPKVLFDKTGRGKSKS 
VELEDVKFHOCVRLSRFENDRTISPIPPDGEFELMSYRLNTHVK 

pl.iwiesviekkshsrieymikaksqfkrrstannveihipvpn 
dadspkfkttvgsvkwvpenseiwsiksfpggkeylmrahfgl 
psveaedkegkppisvkfeipyfttsgiqvrylkiieksgyoal 
pwvryitqngdyqlrtq 


6496


247 


559 


LRAV S L>L P LQZj VLP EYSIHSLFCI M FbCAQE W bTLGLNV PLL F Y 
HFWR Y FHCPADSSELAYDP PVVMNADTLS YCOKEAWCKJLAFYTjL 
SFFYYLYCMIYTLVSS 


6497 


1053 


352 


ANT0ICRLCPRRHLKPPCGAKKGNGTEEDYNFVFKWL1GESGV 
GKTNLLSRFTRKEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
T AGI.ER YRA I TS AYY P.G AVGAIjLVFDLTKHQT Y AWER V] LKE LY 
DHAEATIWMLVGNKSDLSQAREVPTEEARMFAENNGLLFIjETS 
ALDSTNVBLtAFETVLKEIFAKVSKORONSIRTWAirLGSAQAGC 
EPGPGEKRACCISL 


6498 


2636 


272 


SLRLCPWGTKLAGPTTMRLSSLLALIjRFALPLILGLSLGCSLSL 
LRVSW1CGEGEDPCVEAVGERGGPQNPDSRARLDQSDZDFKPR1 
VPYYRDFKKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRTVAHHrPRLLYFTGQRGARAPAGMQWSHGDERPAWLM 
SETLRHLHTKFGADYDWFFIMQDDTYVQAPRLAALAGHLSINOD 
LYLGRAEEFlGAGEOARYCHGGFGYLLSRSLliLRLRPHLDGCRG 
DILSARPDEWLGRCLIDSLGVGCVSOHOGOOYRSFELAKNRDPE 
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALELERAYSEIEOL 
0A0IRNLTVLTPEGEAGLSWPVG1..PAPFTFHSRFEVLGWDYFTE 
0HTFSCADGAPKCP10GASRADVGDALETALEQLNRRY0PRLRF 
QKORLLNGYRRFDPARGMEYTLDLLLECVTORGHRRALARRVSL 
LRPLSRVE 1 LFKPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA 
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAV?LAVRAEAPSQVRLMDWSKKHPVDTli?FLTTVWTRPG 
PEVLNRCRMNAISGWQAFFPVHFQEFNPALSPORSPPGPPGAGF 
DPPSPPGADPSRGAP1GGRFDRQASAEGCFYNADYLAARARLAG 
ELAGOEEEEALEGliEVTIDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CSPRLSEELYHRCRLSNLEGLGGRAQLAMALFEQEOAKST 


6499 


3


2040 


SCSADTRPSGOAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP 
GHPPVR PQVSGGPGAKPDPAAHLP FFYGS I S RAE A EEK L KLAGM 
ADGLFLLROCLRSLGGYVLSLVHDVRFHKFPIEROLNGTYAIAG 
GKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPOPGVFDC 
LRDAMVRDYVR0TWKLEGEALEQA1IS0APOVEKLIATTAHERM 
PWYRSSLTREEAERKiYSGAQTDGKFbbRPRKEQGTYALSLIYG 
KTVYHYLI SQDKAGKVC 3 PEGTKFDTLWQLVE YLKLKADGLI YC 
LKEACPNSSAC NASGAAAPTLPAH PSTLTHPQRR I DTLN SDGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKXLFLKRDNL 
LIAD3 ELGCGNFGS VRCGVYRMRKKQI DVAI KVLKQGTEKADTE 
EMMREAQ2 MHQIjDNP Y IVRLI GVCQAEALMbVMEMAGGGPLHKF 
LVGKR E E 1 PV S NVAE LLHQ VS MGM KYLE 2 KN F VHRDLAARlTVliL 
VNRKYAKI SDFGLSKAIXSADDSYYTARSAGK^PLKWYAPECINF 
RKFSSRSDVW5YGVTMWEALSYGOKPYKKMKGPEVMAFIEOGKR 
MECFPECPPEIjYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA 
SKVEGPPGSTOKAEAACA 


6500


1773 


726 


TGPTHAS AJDAWGLVRS VTEWCANVRGN PCA^ALS CPQAVLDAGK 
MLSES S S FLKG VMLGS I FCAL I TMLGH I R I GHGNRMHHHEHHH1 
QAPNKED T LK 3 S EDE 3MELSKS FR VY C 1 1 LVKPKDVS LWAAVKE 
TWTKHCDKAEFFSSENVKVFES I NMDTNDMWLMMR KAYKYAFDK 
YRDCYNWFFLARPTTFAI IENLKYFLLKKDPSQPFYLGHTl KSG 
DLEYVGM EGG J VLS V ES MKRLNS JjLN 3 P EKC P EQGGM I W K I S ED 
KQLAVCLK YAGVFAENAEDADGKDVFNTKSVGLS I KEAMTYHPN 
Q WEGCC S D MAVT FNG LTPNQMHVMMYG VY RLJRAFG P Y FO 


6501 


1 


570 


LVGMS GGGTETF VGCEAAPGGGS K KRDS LGTAGSAHLI I KDLGE 
IHSRLLDKR P VI OGETR YFVKEFEEKRGLR EMRVLENLKNM IHE 
TNEHTLPKCRDTMRDSLSQVL0RLOAANDSVCRLQ0REQERKK1 
HSDHLVAS EK0HMLQKDNF1^KE0PNKRA£V1)EEHRKAMER1jKEQ 
YAEMEKDLAKFSTF 


6502 


213 


1650 


AGMKPDPWAGRNRTAVLPDVSVFRREDVGWWRSWLOOSYQAVKE 
KSSEALEFMKRDLTEFTQWQHDTACTIAATASWKEKLATEGS 
SGATEKMKKGLSDFliGVISDTFAPSPDKTIDCDVITLMGTPSGr 
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGE1 SELLVGS PS I RALYTKM VPAAVSHSE FWHRY FY KVHOLEQ - 
EOARRDALKQRAEQS 1 SEEPGWEEEEEELMG I SP1 S PKEAKVPV 
AXISTFFEGEPGFQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QI ANPATA PE.ARVLPKDLSQKLLE ASLEEOGLAVDVGETGPS PP 
2HSKPLTFAGHTGGPEPRPFARVETLREEAPTDLRVFELNSDSG 
KSTPSNNC-KKGSSTDISEDWEKDFDLCMTEEEVQKALSICVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503


23 ;* 


1650 


AGNKPDPWAGRNRTAVLPDVSVFKREDVGWWRSWLQOSYQAVKE 

vcCTr&T.PTr«tvDrit TPPTm/vnunTar^TT a 2»T2vq\a/k , ptt7 liTPCC 
fvtior./*iJt r Pift.HULii tit lyv vxjtxu i i it\/\i v vjvt<i\ia/iir.ijo 

SGATEKMKKGLSDFLGVISDTFAPSPDKT3DCDVITLMGTPSGT 

AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 

KGE1SELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVH0LEQ 

EQARRDALKORAEOS I SEEPGWEE ESEELMGI S P I S PKEAKVPV 

AKISTFFEGEPGPOSPCEENLVTSVEPPAEVTPSESSESISLVT 

0 1 ANPATA P E ARVLPKDLSQKLLEAS LE EQGLAVDVGE7G PSFP 

i rib T\trL)l FMbnllswrtrKrrrtKVL 1 L>HC<C>vl 1 L»1jKV r c.XiIMoL'ou' 

KSTPSNNGKKGSSTDISEDWEKDFDLDMTBEEVOMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAKWVCLS 1 LS PPPAGMKT PNAQE AEGQQTRAAAGRATG 
S ANMTKKK VSQKKQRGRPS SQPCRN I VGCR 1 S HG WKEGDEP ITQ 
KKGTVLDOVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHI SDANLANT3 IGKAVERMFSGEHGSKDEWRGMVLAQA 
PI MKAWFY I TYEKDP VjjYM YULiLDDYKr.GDJjK IMPELS LSPP.TE 
REPGGWEGLI GKHVE YTKEDGS KR I GMVI HQVEAKPSVYFI KF 
DDDFHIYVYDLVKKS 


6505 


2133 


1294 


GKVCLVAHWVCLS 1LS PPPAGMKTPNAOEAEGQQTRAAAGRATG 
SANMTKKKVS0KKQRGRPSSOPCRNIVGCR3SHGWKEGDEPIT0 
W KGT VLDCV P 1 N PS LY LV KYDGI DC V Y G LELHRDER VLS LK I LS 
DRVASSHI SDANLANTI 1 GKAVEHM FEGEHGS KDEWRGMVLAQA 
PlMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGWDGL1GKHVEYTKEDGS KRI GMVXHOVEAKPSVYF I KF 
DDDFHIYVYDLVKKS 


6506


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCFDCGST 
ELVEDSHYSQSQLVCSDCGCVVTEGvLTTTr SDEGNLjREVTybR 
STGENEQVSRSOORGLRRVRDLCRVLOLPPTFEDTAVAYYQQAY 
RHSG3 RAARLOKKEVLVGCCVLITCROKNWPLTMGAI CTLLYAD 
LDV PSSTY MQ1 VKLLGLD VP SLCLAELV KT YCS SFKLFQASPSV 
PAKYVEDKEKNLSRTMQLVELANETWLVTGRHPLPVITAATFLA 
W0SL0PADRLSCSLARFCKLANVDLPY PAS SRLOELLAVLLRMA 
EOLAWLRVLRLD KRS WKH I GDLLQKRQS LVR S AF RDGT ABVET 
R EKEP PG WG QG0GEGS VGNNSLGLPQG KR PAS FALLLPPCMLKS 
PKR I CP VP PV^TVTGDFNI SDS E I EOYI jRTPOEVRDFORAOAAR 
OAATSVPNPP 


6507 


1878 


925 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVT2 P I WQNKPHGA 
AS S WRR I GTNLPLKPCARASFETLPN 3 SDLCLRD VPPVPTLAD 
I AVJ I AADEEETY AR VRS DTRPLRHTViK P S P LI VKQRN AS VPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTFDFLSPGSSNVSS PLPCFGSSFHSTTS FVI SDI TSETEVE 
VPE LPS VPLLCSAS PECCKPEHKAACSS SEEDDCVSLSKASS FA 
DMMGILKDFHRMKQSQDLNRSLLKEEDPAVMSEVLRRKFALKE 
EDISRKGN 


6508


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHVVVDACSSRSQVDRLVALARMRQSGAFLSTSEGL 
I LQ LVGDAVH P0 FKEZ 0KL I KE P AP DSGLLGLFQGQN S LLH 


6509


2 


1053 


FVm PRGGR KRRRQ AAVTQ AATRASGTPS PRDGTMTQGKLS VAN 
KA PGTEGQQ0 VHGE KKE APAVPS A P PS Y EEATSGEGM KAG AFPP 
APTAVPLHPS WAYVDPSSS SSYDNG FPTGDHELFTTFSWDDQKV 
RRV FVRKVYT I LLIQLLVTLAWALr TF CD P VKDYVQAN P G WY W 
A S Y AVFFATY LTLACCSG PRRH F P WNL I LLTV FTLSMA YLTGML 
SSVYNTTSVbLCLGITALVCLSVrVFSFOTKFDFTSCQGVLFVL 
L^TLFFSGLILAILLPFQyVPWLHAVYAALGRGVFTLFLALDTQ 
Ll-. r -;GNRRHSLS PEE Y I FGALN I YLD3 3 Y t FT FFLQLFGTNRE 


6510 


3? 


1156 


PC;-LDGCPORGAVHPLLSSA>tGLLAFLRTOFVLHLLVGFVFWS 
G1.VINFV0LCTLALWPVSKQLYRRLKCRLAYSLWSQLVMLLEWW 
S C7 S CTLF TD 0 ATVE R FG KEHAV 1 1 LNHNFE I DF L CG WTMCE R F 
GVLGSSKV LA KKELLYVPLIGWTWYFLE I VFCKRKWEEDRDTVV 
EGLRRLSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 
K YK L L PRT KG F TTA V KCLRGT V AA V Y DVT LN FRGN KN PS L LG I L 
YGK K YEAD M C VRR FPL ED I P LD E K EAAQ WLH KLYQE KDALQE I Y 

no>:gmfpgeqfkparrpvjtllnflswatillsplfsfvlgvfas 
g £ pl li ltflgfvgagnghcr 


6511 


254J 


1425 


gekoplaaaptecleqviggagdpgtwasfpsplpgpaplkggk 
tm^-tnfsdi vkogyvkmksrklgi yrrcwlvfrkssskgpqrle 
kyrdbksvclrgcpkvteisnvkcvtrlpketkrqavaiiftdd 
s ak t ftcds e leaee w yktls veclgsrlnd i slge pdllapgv 
qc l otdr fn v fllpc pnldvyg e c xlq3 then i y lwd i hn pr vk 
lv.s w plcslrrygrdatrftfeagrmcdageglytfqtqegeqi 

YOK\^:SATLAIAEQKKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
KL7- R SAYWKH I TGSQNI AEASS YAGEGYGAAQASSETDLLNRFI 
LLKrKPSCGDSSEAKTPSQ 


6512


1S£ 


807 


FGKKSTWFPLSRSLRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EV F 5 GSGRATG C ERGG VRGAROGRAPGS S 1 WR KEPRMVCTRKTK 
TLVSTCVILSGMTNI 1 CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EE OKGDTLK 3 I ERLDHLENVI KQH I QE AP AKPEE AE AEP FTDS S 
LFAH WGQELS PEGRRVALKOFQYYGYNAYLSDRLPLDRP 


6513


2 


756 


FV £ F E PGFS LAQLNLI WQLTDT KQLVHSFAEGQDQGSAYANRTA 
LF P D LLAOGN ASLRLQR VRVADEG S FTCFVS 3 RDFGSAAVSLQV 
AAFYS KPSMTLEPNKDLR PGDTVT 3 TCSSYQGYPEAEVFWQDGQ 
GVFL7GNVTTS OMANEQGLFDVKS 3 LR WLGANGTYSCLVRNP V 
LCCOAHSS VT3 TPQRS PTG AVE V0 VP EDPWALVGTDATLR CS F 
SPK PG FS LAQLNLI WQLTDTKQLVHSFAEGQDOGS AYANRTALF 
PELLAQGNAS LRLQRVRVADEGS FTCFVS I RDFGSAAVSLQVAA 
PYS}:PSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGOGV 
PLa G MVTTS QMANEC<3 LFD VHS I LRWLGANGTYS CLVRNPVLQ 
ODAH S S VTI TPQRS PTGAVEVQVPEDPWALVGTDATLRCS FSP 
EPGr SLAQLNL3WQLTDTR0LVHSFTEGR 


6514 


985 


302 


VG3 } GPTI SSAAEMEDLLDLDEELR YSLATSRAKMGRRAOQESA 
0AE J vH LNGKNS SLTLTGETS S AKLPRCRQGGW AGDS VKAS KFRR 
KA5? E E I EDFRLRPQSLNGSDYGGD3 PI 3 PDLEEVQEEDFVLOVA 
APP£ I QI KR VMTYRDLDNDLMKYSAIOTLDGE 3 DLKLLTKVLAP 
EH EVRERNPSKODDVGWDWDHLFTEVS SSVLTE WDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGSTQLEVSASASCGALGSADMNPIVV 
VK6C-GAGPI S KDRKER VHQGM VRAATVG YGILR EGG S AVDAVEG 
AWALEDDPEFNAGCGSVlJmiGI^VEMDASlMDGKDLSAGAVSA 
VOC1ANPI KLARLVMEKTPHC FLTDQGAAQFAAAMG VPE I PGEK 
LVTERNKKRLEKEKHEKGAQKTDCQKNLGTVGAVALDCKGNVAY 
ATS TGGI VN KMVGR VGDS PCLGAGGYADNDIGAVSTTGHGES IL 
KVNLARLTLFHIEOGXTVEEAADLSLGYMKSRVKGLGGLIVVSK 
TG DtWAKWT S TSM P WAAAKDG KLHFG I D P DDTT 3 TDLP 


6516 


1 


1402 


FR R LR Y LGQDAT AAARDLRTRGLQG Y C P S ATARQQ VLVS ALQQL 
KGPKHEHR^rENQEMPYSTNKELILGl^fVGTAGISLLLLWYHKVR 
KPG 3 TiMKLPEFLSLGNTFNSI TLODEI IIDDQGTTVI FQERQIjQI 
LEKJ.,^ELLTN^EELKEBIRFLKEAIPKLEEYIQDELGGKITVHK 
3 S PC'HRAR KRRLPTI QS SATSNS SEEAESEGGY I TANTDTEEQS 
FPVPKAFNTRVEEbNLDVLLOKVDHLRMSESGKSESFELLRDHK 
EXFRDEIEFMWRFARAYGDMYELSTNTQEKKHYAMI GKTLS ERA 
INRAPMNGHCHLWYAVTjCGYVSEFEGLQNKINYGHLFKEHLDIA 
I KLLPEEPFLYYLKGRYCYTVS KLSW1 EKKMAATLFGKI FSSTV 
0EALKNFLKAEELCPGYSNPNYMY1AKCYTDLEENQNALKFCNL 
ALLLPTVTKEDKEAQKEKQKIMTSLKR 


6517 


3 


1414 


GRVWGGS s s lnamvy vrghaedy er worqgargwd y AHCLPYFR 
kaoghelgasryrgadgplrvsrgktnhplkcafleatqqagyp 
ltedmngfqqegfgwmdmtlhegkrvisaacaylkpalsrtnlka 
eaetlvsrvlfegtravgveyvkngoshrayaskevilsggain 
s pqllmls g ignaddlkklg i p wchlpgvg0nlodhlei y3 0q 
actrp i tlhs aqkplrkvc iglewlwkftgeg atahletggf i r 
sqpgvphpdiqfhflpsqv1dhgrvptqqeay0vhvgpmrgtsv 
gwlklrsanpqdhpviqpnylstetd3edfrlcvkltreifaoe 

AIiAP FRG KELQ PG S K I Q SDKE I D AFVRAKAD S AYH P SCTCKMG Q 
PSDPTAWDPQTRVLGVENLRWDAS1MPSMVSGNLNAPTIM1A 
EKAADI IKGOPALWDKDVPVYKPRTLATQR 


6518


242 


1096 


P AWN PGS E P RTRVKPRARSF PLP P PRAPRRRKHRLLRAVPGPSR 
RHRCRRRAPPPPSTMGDAGS ERSKAPSLPPRCPCGFWGSSKTKN 
LCSKCFADF0KKQPDDDSAPSTSNS0SDLFSEETTSDNNNTS1T 
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSL1TPTKRSC 
GTDSQSENEASPVKRPRLLENTERSEETSRSKOKSRRRCFQCCT 
KLELVQQELGSCRCGYVFCMLHRLPEOHDCTFDHMGRGREEAIM 
KMVKLDRKVGRS CQR I GEGCS 


6519


3 


1113 


ERKMAKPPSP VHC VAAAAPTATVSE KEPFGKLQ LSS RD PPGSLS 
AKKVRTEEKICAPRRVT^GEGGSGGNSRQLQPPAA P5PQS YGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSgPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKE KEREKKKHK 
VMNE 1 KKENG EVK I LL»KSGKEKP KTN I EDLQ 1 K KV KKK KKKKHK 
ENEKRKRPKMYSKS 1 OTI CSGLLTDVEDOAAKG ILNDNI KDYVG 
KNLDTKJN YDSKI PENSE FP FVSLKEPR VQNNLKRLDTLBFKQLI 
H I EKQ PNGGAS V I HCLQ 


6520 


3 


1113 


ERKMAEPPS FVHCVAAAAPTATVSEKE PFGKLOLS S RDPPGS LS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPKHLLLPAAAAAASANAKS RRPKE KRE KjERRRHGL 
GGAREAGGASREENGEVKPLPRDK I KDKIKERDKE KEREKKKHK 
V14NEIKKEKGEVKILLKSGKEKPKTN1EDLQIKKVKKKKKKKHK 
ENEKRKRPKMYSKS I QTI CSGLLTDVEDQAAKG2 LNDNIKDYVG 
KNLDTKNYDSKIPEKSEFPFVSLKEPRVQNNLKRLDTLEFKQLI 
H I EHQ PNGG AS V I HCLQ 


6521


184 


1796 


JOjFK^TDTSQGELVIIPKALPLIVGAOLIHADKLGEKVEDSTMP 

irrtvnstretppksklaegeeekfepdisseesvstveeqene 
tppatsseaegpkgepeneekeenksseetxkeekdqskekekk 
vkktlpswatlsasolaraqkqtpkiassprpkmdajlteaikac 
fqksgaswairkyi 1 hkypslelerrgyllkqalkrel.nrgv1 
kovkgkgasgsfvvvoksrktpqksrnrknrssavdpepqvkle 

DVLPLAFTRLCEPKEASYSLIRKYVSOYYPKLRVDIRPQLIjKNA 
LORAVERGQLEQITGKGASGTFOLKKSGEKPLLGGSLMEYAILS 
AIAAMNEPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCE 
KNGWMEQISGKGFSGTFOLCFPYYPSPGVLFPKKEFDDSRDEDE 
DEDESSEEDSEDEEP P PKRRLOKKTPAKSPG KA-ZiS VKQRGS KPA 
PKVS AAQRG KARPLP K KAP F KAKTP AKKTRPS STV I KKP SGGSS 
KKPATSARKE 


6522 


1042 


391 


NKWLRPSPRSHRTPESGRVLSLFRLPPFGMALSGSTPAPCWEED 
WO 01/53312 



PCT/US00/34263



ECLDY YGMbS LHRM F E VVGGQLTECELELliA FLLDE APGAAGGL 
SRARSGLKLLbBLERRGQCDESNljRLLGQLLRVLARHDLLPKLA 
RKRRRPVSPSRYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRORRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


3 0S7 


RSR KLG 2 FRRCWLVFKKASSKG PRRbEKFPDEKAAYFRNFHKVT 
ELHNI KNI TRLPRETK KHAVA 1 1 FHDETSKT FACES ELEAEEWC 
KHLCMECLGTRLND1 SLGEPDLIJ^AGVOREQNERFKVYLMPTPN 
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEM3 YQKVHSATLAIAEQHER 
LML5MEQKARLQTSLTEPMTLSXS3SLPRSAYWHHITRQNSVGE 
lYSLOGNHENRHSDLI'GKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524 


2 


IC'97 


ASCCTRRRTAALDSGERIAGRRSFIALAMASNFNDIVKQGYVKI 
RS R K LGI FRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNI KNITRLPRETKKHAVAI I FKDETSKTr ACESELEAEEWC 
FJiLCKECLGTRLNDlSLGEPDLLAAGVQREONERFNVYLMPTPN 
LD I YGECTMQI TH EN 2 YLWD1 HN AKVKLVMWPLSSLRR YGRDST 
WFT FESGRMCDTGEGL FTFQTREGEMI YQKVHS ATLAI AEQHER 
LMLEKEOKARLQTSLTEPMTLSKSISLPRSAYVJiiHITRQNSVGE 
I Y S LOGNH EN RH SDLTG KSC KTS ENR FLEENAP LVMYGI THHLF 
MDTSTCKWHDLE 


6525 


3 


ifcS9 


GESPFSEEES I EFNPS S SGRSARTVSSNSFCSDDTGWPSSQSVS 
FVXTPSDAGNSPIGFCPGSDEGFTRKKCT3GMVGEGSIQSSRYK 
KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNF 
ECYLT PLQOKEVTVRKL KTKLKES ER RLHERES E I VELKSQLAR 
KiREDWIEEECHRVEAQLALKEARKEI KQLKQVI ETMRSSLADKD 
KG10KYFVD1NIQNKKLESLL0SKEMAKSGSLRDELCLDFPCDS 
PEKS LTLN P PLDTMADG LS LEEQVTG EGADR ELLVGDS I ANSTD 
LFDE1 VTATTTESGDLELVHSTPGATWLELLP1 VMGOEEGSVVV 
ERAVQTDWPYSPA1 S EL1QSVLQKLQDPCPSS LAS PDES EPDS 
MESFPESLSALWDLTFKNPNSAILLSPVETPYANVBAEVHANR 
LMR ELDFAACVEERLDG VI PLARGGWRQYWSS S FLVpLLAVAA 
PV\'PTVLWAFSTORGGTDPVYNIGALLRGCCWALHSLRRTAFR 
IKT 


6526 


2 


2024 


SGRAGEPEEWRGRQI3D£KE1*WIPFNSEDSQQLEEAYSSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTKFYKGDKD 
NKYVPYSESFSQVLEETYMLAVTLDEWKKKLES PNREI1ILKNP 
KLM\m r QPVAGSDDWGSTPME0GRPRTVKRGVENISVD3HCGEP 
LOIDHLVFWHGIGPACDLRFRSIVOCVNDFRSVSLNLLQTHFK 
KAQEKQQI GRVEFLPW WKSPLHS TGVDVDLQR I TLPS1NRLRH 
FTNDT I LDVPFYNSPTY CQTI VDTVASEMNRI Y TLFLQRNPDFK 
RnV^ 7 Ji^H^T^^TiT'LFPT "LtTNDKn^LGDTDSEKGSLtNI VMDOGD 
TPTLEEDLKKLQbS EFFD1 FEKEKVDKE ALALCTDRDLQE 3 GI P 
LGPRKKILNYFSTRKNSMGIKRPAPOPASGANIPKESEFCSSSN 
TRNGDYLDVGIGQVSVKYPRLI YKPEI FFAFGS PIGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYR1EPMVVPGVEFEPMLI 
PKHKGRKRMHLELREGLTRMSMDLKIWLLGSLRMAWKSFTRAPY 
PAL-OAS ETPEETEAE PES TS EK PS DVNTEETSVAVKEEVLP I NV 
GMLNGGQRI DYVLQEKF 2 ES FNEYLFALQSHLCY WESEDTVLLV 
LKEIYQTQGIFLDQPLO 


6527 


1 


922 


GWUPLLSRILPSDACK 3 YKQGI NI RLDTTLI DFTDMKCQRGDLS 
FI FNGDAAPSES FWLDNEQKVY QR I H HEES EMETEEEVD I LMS 
SDI YSATLSTKS ISFTRAQTGWLFREDKTERVGNFLADFYLVNG 
LVLESRKRREHLSEEDILRtfKAIMESLSKGGNlMEQNFEPIRRQ 
SLTPPPQNT1TWEEV1SAENGKAPHLGRELVCKESKKTFKATIA 
MSQEFPLGIEliLLNVLEWAPFKHFNKLREFVOMKLFFGFPVKL 
DIPVFPTITATVTF0EFRYDSFDGSIFT1PDDYKEDPSRFPDL 


6528 


1 


1073 


bTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKI lfcvlglyia j 
IPFLIKLCPGIOAKLlFLNFVRVPYFIDbKKPODQGLNHTCNYY ; 
LQPEEDVTIGVWHTVP/vWWKNAQGKDQMWYEDALASSHPl ILY ' 
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE i 
RGMT YDALHVFDW I KAR SGDN P VY 1 WGHS LGTGVATNLVRRLCE 
RETPPDALILESPFTN3REEAKSHPFSVIYRYFPGFDWFFLDPI | 
TSSGIKFAKDENVKHlSCPLblLHAEDDPVVPFQLGRKLYSIAA 1 
PARSFRDFKVOFVFFKSDLGYRHKYIYKSPELPRILREFLGKSE ' 
PEHQH 


6529 


363 


2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 
EGVEDESFLKWFCGNVNEQNVLSERELEAFSIL0KSGKP3LEGA 
ALDEALKTCKTSDLKTPRLDDKELEKLEDEVQTLLKLKNLKIQR 
RNKCQLMASVTSHKSLRLNAKEEEATKK^KQSOGILNAMITKIS 
NELQALTDEVTOLMMFFRliSNLGQGTNPLVFLSQFSLEKYLSQE 
EOS TAALTLYTKKOFFOG 1 HE WES SNESOFFN FLKI OTPS I CD 
NQEILEERRLEMARLOLAYICAQHQLIHLKASNSSMKSSIKKAE 
ESLHSLTSKAVDKENLDAKISSLTSEIMKLEXEVTQIKDRSLPA 
WRENAOLLKMPWKGDFDLOIAKQDYYTAROELVLNOLIKOKA 
SFELLQLSYEIELRKKRDIYROLENLVQELSOSNMMLYKQLEML 
TDPSVSQ01NPRKTIDTKDYSTHRLYQVL.EGENKKKELFLTHGN 
LEE VAE KLKONI S LVQDQLAV S AQEHS F FLS KRN KDVDMLCDTL 
YQGGNQLLLSDOELTEOFHKVESQLNKLNHLLTDILADVKTKRK 
TLANHKLHOMEREFYVYFLKDEDYLKDI VENLETQS K 1 KAYS LS 
D 


6530 


128 


2986 


GAAHHGAI VQVHPLLPGSSTI MI HDLCLVFPAPAKAWYVSDI Q 
E LYI R WDKVE I G KTVKA YVR VLDLHKKP r LAK YFPFMDLKLRA 
ASPIITLVALDEALDNYTITFLIRGVAIGOTSLTASVTNKAGQR 
1 NS APQO I EVFP P FRLMF RKVTLLIGATMQVTSEGGPQPQSH I l» 
?SI SNESVALVS AAGLV0GLA1 GNGTVSGLVOAVDAETGKWI I 
SQDLVQVEVLLLRAVRI RAP! MRMRTGTQMPI YVTG I TNHQNPF 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEASI RLPSQYNFAMNV 
LGRVKGRTGLRAWKAVDPTSG0LYGLARELSDE1OVQVFEKLQ 
LLNPE I EAEQI LMS PNSY1 KLOTNRDGAAS LSYRVLDG PEKVP V 
VHVDEKGFLASGSMIGTST2 EVIAQEPFGANQT1 I VAVKVS PVS 
V LRVSMS ? VLHTQN KEALV AV PLGMT VTFTVH FF.DNSGDVFHAH 
SSVIjNFATN^DFVOIGKGPTNI^TCVVRTVSVGLTLLRVWDAKH 
PGLSDFMPLPVLOAI S PELSGAIWVGDVLCLATVLTS1.EGLSGT 
WSSSANS I LHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW 
S VPQR I MARHLH PIQTS FQEATAS KVI VAVGDRSSNLRGECTPT 
OR EVI Q ALH PETL I S C0S0FKP A V FD FP S QDVFT VE PO FDTALG 
OYFCS 1 TMHRLTDKQRKHLSMK KTALWSAS LSSSHFSTEQVGA 
E V PFS PGL? ADOAEI I»L SNHYT S S E IRV FG A? E VLENLE V KSG S 
FAVLAFAKEKSFGWPSFITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQA1AIPVTVAFWDRRGPGPYGASLF0HFLDSYQVKFFTLF 
ALLAGTAVMI IAYHTVCTPRDLAVPAALTPRASPGHS PHYFAAS 
S PTSPNALP PARKAS PPSGLWS PA YASH 


6531 


845 


1425 


PSASIPPSASPDPVPDJRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVK\T?FJVRGCGQYISYRC0EKRNTYFAEYWYOA 
OCCQYDYCNSWSSPOLOSSLPEPHDRPLALPLSDSQIOWFYOAL 
NLS LPLPN FhT^GTEPDG LDPM VTLS LNLGLS FAELRRMYLFLNS 
SGLLVLPQAGLLTPKPS 


6532 


2 


954 


AAG P PS E WNQDS L FP E P E PG PA P Q VLLGPQG PGL I KG VA PF TL 








ITDSTGTHLVLTVTNKNAKSPGLSRGSPQO?S£0PGSFA?APSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSOOPKQQENGSSSQ 

omddiifdi l 1 osg e 1 £ adfks ppslpgkek ps p ktvcw s p laaq 
pspsaelpqaappppgspslpgrledfiiesstglplltsgkdgp 
eplsliddlksqmlssta:ldhppspmdtselhfvpepsstmgl 

DLADGHLDS MDWLELS SGG P VLSLAPLSTTAPS L FS TD FLDGHD 
LQLKWDSCL 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAARLRFPGGSRPLRARACVLALAVL 
ALLERNNADSMSAKSMLCERI AIAKELIKRAESLSRSRKGG I EG 
GAKLCSKLKAELKFLQKVEAGKVAIKESHLOSTKLTHLRAIVES 
AENLE E WS VLH V FG YTDT LG E KQTLWD W AN GGHTWV KA I GR 
KAEALHNIWLGRGQyGDKS I lEQAEPFLOASHOQPVOySNPKI I 
FAFYN SVS S PMAE KLKEMG 1 S VRGD I VAVNALLDHPEELO PS ES 
ESDDEG PELLQVTR VDREN 1 LASVAF PTE I KVDVCKRVNLD j XT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEOVLPQLEAFKKD 
KELFACESAVKDFOSILDTLGGPGERERATVlJ KRINWPDQPS 
ERALRL VAS SKINSRSLTI FG TGDTLKAI AN SG FVRAANNQ 
GVKFSVFIHQPRAbTESKEALATPLPKDYTTrSEK 


6534 


47 


596 


KATRFISAAFWLNKOGVSFAKLPHTSWSWELOTLSFLFSGDLA 
EKSLOCFPCSAMLJjELI PLLG I HFVLRTARACS VTQPDI1 11 T VS 
EGASLELRCNYSYGATPYLFWMERTVEEAFIbLVCLKPWRVASS 
LEKKEKEDESFOLbLGSRYNVLKAHCLLPLlRKLTSGDSLLSAO 
PHCPQGL 


6535 


250 


964 


blKTFFRDVAIORDLLPKEKNLETLLTLAFLEjDKAFSSHARLS 
ADATLLTSGTTATV ALLRDG I EL WAS VCDS RA I LCRKGK PMKL 
TI DHTPE RKDEKER I KKCGG FVAWNSLGQPHVKGRLAMTR SIGD 
LDLKTSGVIAEPETKRIKLHHADDSFLVLTTDG J NFMVNSQE I W 
D FVNQCH D PN EAAHA VTEOA I Q YGTEDNSTA VW P FGAWG K Y KN 
SEINFSFSRSFA5SGRWA 


6536 


242 


1174 


S bVKEMTN Q Y G I LF KOEQAH DDA I WSVAWG TN K K ENS E T WTG S 
LDDLVKVWKWRDERLDLOWSLEGHQLGWSVDJ SHTLPIAASSS 
LDAH 1 RLWDLENGKQ IKS1 DAG P VD AWTLAF £ PD SQ YLATGTK V 
G107N3 FGVESGKKEYSLDTRGKFILSIAYS PDGXYLASGAI DG I 
INI FD 1 ATGKLLHTLEGHAMP I RS LT FS P D S C LLVTASD DG Y I K 
IYDVOHANLAGTLSGHASWVLNVAFCPDDTH?VSSSSDKSVK\^W 
DVGTRTCV11TFFDHQDQVWGVKYNGNGSKI VS VGDDQEI H 1 YDC 
PI 


6537 


1638 


921 


WRFNPFPTOGPDPSLVYRPDVDPEVAKDKASFRKYTSGPLLDRV 
FTl^KLrWTHQTVDFVRSKKAQFGGFSYKKMr^iEAVPLLDGLV 
DESDPDVDFPKSFHAFQTAEGIRKAHPDKDWFHLVGLLHDLGKV 
LALFGEPOWAWGDTFPVGCRPQASWFCDSTFQOKPDLODPRY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLSPOSTCTR 


6538 


3345 


2412 


PYLYDFLDAL I TCQTAPEEAF I KLDGLAGMLTEOijRRIjTKOVOE 
AR HNR DDEA I K KA VNE YDETMEK Y 1 P VLMAOA K 1 Y WNLEN Y PM V 
EKIFRKSVEFCNDKDVWKLNVAHVLFMOENKYKEAIGFYEPIVK 
if H V DM 7 T .TJV Q A T VI 1 .CV Y I MTSONEKAE EU.< R K I EK EE EOL 
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYEFG3 SRVIKSLEPYN 
KXLGTDTWY YAKRCFLSLLEKMSXHMI VIHDS V 2 OECVQFLGHC 
ELYGTN3PAVIEQPLEEERMHVG1CNTVTDESR0LKALIYEIIGW 
NK 


6539 


218 


339 


FLGAASPHPHFSSbAPHPDOPEFTPVQDELEA-WELWGPGV 


6540 


3 


391 


LERLWLLLLRRPEDAJ4AECPTLGEAVTDHPDRLV7AWEKFVYLDE 
KOKAWLPLTI El KDRLOLRVLLRREDVVLGRFMTPTQIGPSLLP 
1MW0LYPDGRYRSSDSSFWRLVYHIKIDGVEDKLLELLPDD 


6541 


116b 


536 


RTbVCRRVi^MbbRKPARGRDLRGRGRGTPRGGRKGLLPTPDEFP 
RFEGGRKFDSWDGNREPGPGKEHFPJ3TPRPDHPPHDGHSPASRE 
RSSSUJGKDKjASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPX SGRSSSLDGEHHDG YHRDEPFGGP PGSGTP 

srggrsgsnwgrgsnmnsgpprrgasrgggrgr 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGRAJjYLAFLARKEGT 
KRGFLSKKTAEASRWHEKWFALyOIWLFyFEGEOSCRPAGMYIiL 
EGCSCERTPAPPRAGAGOGGVRDALDKOYYFTVLFGKEGQXPLE 
LRCEEEODGKEWMEAIHOASYADILIEREVLMOKYIHLVQIVET 
EKIAANQI.RHOLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPD1KKIKKV0S FMRGWLCR RKWKTI VOD YICS PHAESMR KR 
NQIVFTMVEAESEYVHQLYILVKGFLRPLRKAASSKKPPISHDP 
VSS I FLNS ETI MFLHE I FHQGLKARI ANWPTM LADLFDI LLPM 
LNlYQEFVRNHQYSIiQVLANCKQNRDFDiaLKQyEANPACEGRM 
LETFLTYPMFQI PRYI ITLHELLAKTPHEHVERKSLEFAKSKLE 
ELS RVMHDE VS DTEN I RKNLA1 ERMI VEGCD I LLDTSQTFI RQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGG KLHLLKTGG VLSI*1DCTLI EEPDASDDD S KGSGC-VFGHLDF 
KI VVEPPDRAAFTVVLLAPSR0EKAAWMSD1 SQCVDKIRCNGLM 
T3VFEENS KVTVPHMI KSDARLHKDDTD1CFSKTLNSCKVPQ1R 
YASVERLLERLTDLRFLSIDFLNTFLHTYRIFTTAAWLiGKLSD 
I YKRPFTS 1 P VRS LELFFATSQNNRGEKLVDG KS PRLCR KFSSP 
PPLAVSRTSS PVRARKISLTSPLNS XI GALDLTTSSS PTTTTQS 
PAASPPPH7G0IPLDLSRGLSSPEQSPGTVEENVDNPRVDLCNK 
LKRS I QKAVLES APADRAGVES S PAADTTELS PCRS P ST PRBIJR 
YRQPGGQTADNAHCSVSPASAFAIATAAAGHGSPPGF^TERTC 
DKEFI I RR TATNRVLNVLRHWVS KHAQDFELNNELKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDD0DD3 HLKLED1 IQMTDC 
MKAECFESLSAMELAEQI TLLDHVI FRS I PY EEFLGQG WMKLDK 
NERTP Y I MKTSQH FNDMSNLVASQ I MNYADVS SRANA I EKWVAV 
ADICR CLKKYNG VKE I TS ALNRSAI YRLKKT WAKVSKQTKALMD 
KLOKTVSSEGRFKNIiRETLKNCNPPAVPYLGMYLTDLAFlEEGT 
PNFTEEGLVrfFSKMRMISHIIREIRQFQOTSYRIDHOPKVAQYL 
LDKDLI IDEDTLYELSLKIEPRLPA 


6543 


1857 



950 


FVSGCGRAG1 GLSWAMAAEARVSRWYFGGLASCGAACCTHPLDL 
L KVHLQTC QE V KLRMTGMALR WRTDG I IALYSGLS AS LCRQMT 
YSLTRFAI YETVRDRVAKGSQGPLPFHEKVLLGSVSGLAGGFVG 
TPADLVIJVRMQNDVKLPQGQRRNYAHALDGLyKVAREEGbRRLF 
SGATMASS RG AbVTVGQLSCYDQAKQLVLSTG YLS DNIFTHFVA 
S F I AGG CATFLCQ P bDVLKTRLMNS KG E YQG V FHCAV ETAKI*GP 
LAFYKGIjV P AGI RLI PHTVLTFVFLEQLRKN FG I KVPf 


6544 


630 


79" " 


PSPCF IRSRbDGOPWMAGLEAWLSONFSLHOPOSRVRVRRAS I S 
EPSDTDPE PRTLN PSPAGWFVQQHPELELMS SFRERFGRNWLQY 
RSHLEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPOEEAR 
GPOESP0K>5SEEVRAEPOEEEEEKEGKEEKEEGEMAPl>PEAHLG , 
EGKQKECP 


6545 


176 


560 


PPHSHAALLPAAMTPLLTLILWLMGLPLAOALDCHVCAYNGDN 
CFWPMRCPAMVAYCMTTRTYYTPTR^KVSKSCVPRCFETVYDGY 
S KHASTTS CCO YDLCNGTGLATPATLALAPI LLATLWGLL 


6546 


1657 



364 


HLLNGLDE VAAFFVADLGAI VR KH FC FLKCLPRVRPFYAVKCNS 
S PGVLKVLAOLGLGFSCANKAEMELVQHI GI PASKI I CANPCKQ 
I AQI KYAAKHG I QLLS FDNEKELAXVVKSHPSAiCMVLC I ATDDS 
HSLSCLSLKFGVSLKSCRHLLENAKKHKVEWGVSFHIGSGCPD 
PQAYAQS I ADARLV FEMGTELGHKMHVLDLGGGFPGT EGAKVRF 
EEIASV I K S ALDLY FPEGCGVD I FAELGRY YVTS AFTVAVS 1 1 A 
KKEVLLDOPGREEEKGSTSKTiVYHLDEGVYGIFNSVLFDMICP 








TPILQKKPSTEOPLYSSSLWGPAVDGCDCVAEGLWbPObHVGDW 
LV F DNMG AYTVGMG S P FWGTQACH I TY AMSRV A WEALRRQLMAA 
EQEDBVEG VCKPLS CG WE I TD7LCVG P VFTPA SIM 


6547 


j 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVIjI, 
PPLAAAAAGPNRCDTI YQGFAECLI RLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAWJES L00EAR0AP RPNNLHTLCGA 
PVHVR ERGTGSETNQETLRATAPALPKAPAPPL LAAALALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAf4HTDPD\SAAYVPlETDAEDG 
IKGCGITF7LGKGTEVGELK1LSRFQNA 


6549 


73 


1490 


ETGRVCEDAR PACGSRSRRRRKEAAPGI PTPS PSSSS PTSSRPA 
ARAFSKAP ARLSRPRAREEPPDPGRR YIQEE 1 1 OARKHKLI KMC 
SS VAAKLWFLTDRR I REDYFQKEI LRALKAKCCEEELDFRAWM 
DE WLT I EC?GNLGIiR INGELI TAY POWWRVP T PWVQS DS DIT 
VLRHLEKMGCRLMKRPQAIIjNCVNKFWTFOELAGHGVPLPDTFS 
YGGHEN FAKM IDEAEVLEFPMWKl^TRGKRG KAV FLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGG3WGTMLRCST 
DGRMQSNCSLGGVGWJCSLSEQGXQ1AIQVSNI LGMDVCGIDLIi 
M KDDG S FCV CE ANANVG F I AFD KACN LDVAG 1 1 ADYAASLLPSG 
RLTRRMSLLSWSTASETSEPELGPPASTAVDNKSASSSSVDSD 
PESTERELLTKLPGGLFNMNQLliANE I KLLVE 


6550 


2293 


922 


FRVSRDGAPDCGIEOMGLAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLI CFLI ILGLVLFMVYGNVHVSTESNLOATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAXDJilMOMWLKARRDLDRINASFR 
QCQGDR VI YTNNQR YMAA 1 1 LS E KQCRDQ F KDMH XS CDALLFML 
NQKVKTLE VE I AKEKT I CTKDKES VLLNKRVAE EQLVECVKTRE 
l^HQEROlAKEQLOKVOALCLPLDKDKFEMDLRjvLWRDSI 1 ?RS 
LDNLGYNLYHPHSSEIiAS I RRACDHMPS LMSSK VEEUARS LRAD 
IERVARENSDLOROKLEAQQGLRASOEAKQKVE K EAQAREAKLQ 
AE CSRQTQliALEE KAV LR K ERDNLAK E LEEKKR EAEQLRMELAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRXT 
LESQRPPAGIPVAPSSG 


6551


157 


746 


lOPPDPRNMTLAAYKEKMKELPLVSl.FCSCFLAI^LNKSSYKYE 
ADTVDLNWCVI SDMEVIELNKCTSGQS FEVILKP PS FDGVPEFW 
ASLPRRRDPSLEEIOKKLEAAEERRXYOEAELLKHLAEKREHER 
EV 1 QXAI EENNN F I KMAKEKIiAQXME SNKENRE AH LAAMLERLQ 
EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFIiADPLNKSSYKYE 
ADTVDLJ3WCV3 SDMEVIELNKCTSGOSFEVILKF PSFDGVPEFN 
ASLPRRRDPSLEEIOKKLEAASERRKYQEAELLKKLAEXREHER 
EVIQKA IEET^FIKMAKEKLAQKMESNKENREAHLAAMLERLQ 
EKDKHAEEVRKNKEbKEEASR 


6553 


2 


1807 


FVWS KT*lAAHLSYGRVNIiNVLREAVRRELREFLDKCAGS KA1 VWD 
E YLTG P FG LIAQ YSI»LKEHEVE KMF T LKGNR LPAADVKNI I FFV 
RPPXBLMDI IAENVLSEDRRGPTRDFKILFVPRK SLLCEQRLKD 
LGVLGS F IHREEYSLDLlPFDGDbLEMESEGAFKECYbEGDQTS 
LYHAAKGLMTLQALYGTIPQI FGKGE CARQVANMM 1 RMKREFTG 
SONS I FP VFDNLLLLDRNVDLLTPLATOLTYEGL I DEI YG I QNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNAVGS VLSKKAKI IS AAFEERHNAKTVGE I KOFVSQLPHMQAA 
RGS1JU>IHTSIAJELIKDVTTSEDFFDKLTVE0EFMSG1DTDKVNN 
Y I EDC I AQXHSU KVLRLVCLQS V CN SGLKQKVL.DY Y KRE I LQT 
YG YEH 1 L7LHNLEKAGLLKPQTGGRNNYPT I RKTLRLWMDDVNE 
ONPrDISYVYSGYAPLSVRLAOLLSRFGWRSIEEVLRILPGPHF 
EERQPLPTGbQKXROPGENRVTLlFTLGGVTFAElAALRFliSQL 
EDGGTE Y V XATTKLMNGTSW I EALMEKPF 


6554 


11? 


1244 


FEMGSQVSVESGALHWIVGGGFGGIAAASQLQALNVPFMLVDM 
KDS FHHKVAALRASVETGFAKKTFIS YSVTFKDNFRQGIiWGl D 
LK-^QMVLLOGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AY EDMVRQVQRSRFI VWGGGSAGVEMAAEI KTEYPEKEVTLIH 
S 0 VALADK ELL PS VRQE VKE I LLR KG VQLLLS E R VSNLE E LPLN 
E Y R E YI KVQTDKGTE VATNLV I LCTG I K I NS SA YRKAFE S R LAS 
SGALRVNEHLQVEGHSNVYA1GDCADVRTPKMAYLAGLHANIAV 
AN 1 VNS VKQR PLQAYKPGALTFLLSMG RNDGVGQI SGFYVGRLM 
VRLTKSRDLFVSTSWKTMRQSPF 


6555 


1552 


496 


IHMALLRKINQVLLFLLIVTLCVILYKKVKKGTVPKNDADDESE 
TPEELEEEIPWICAAAGRMGATMAAINS1YSNTDANILFYWG 
LKNTLTRIRKWIEHSKLREINFK1VEFNPKGLKGK1RPDSSRPE 
LLQP LN FVRFY LPLLI HQH E KV 1 YLDDDV 1 VQGDIQE LYDTTLA 
LGHAAAFSDDCDLPSAQDI NRLVGLONTYMGYLDYRKKAI KDLG 
1SPSTCSFNPGVIVANMTEWKH0RI7K0LEKWM0XNVEENLYSS 
SLGGGVATS PMLI VFHGKYST3 NPLWHI RHLGWN PDARYS EHFL 
OEAKLLHWrCGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 




6556


241


1449


Mi; Jj^iVb>L_r r V 1 ti vJ_iV J. J. Lt¥Z>±j\JZ> X rVar LiUUX 1A?VLiVKv3.HKv X 

PAAL KA FRR LVNS QGQLR VP WFVTNAGNI LQHS KAQELSALLG 
CE VDADQVI LSHSPMKLFS E YHEKRMLVSGQGP VMENAQGLGFR 
NWTVDELRMAFPLLDKVDLERRLKTTPLPRNDFPRIEGVLLLG 
EPVRWETSLQLIMDVLLSNGSPGAGLATPPyPHLPVLASNMDLL 
WMAEAKMPR FGHGTFLLCLET I YQKVTGKELR Y EGLMGK PS I LT 
yOYAEDLI RRQAERRGWAAP I R KL YA VGDNPMS DVYGANL FHQ Y 
LC KATHDGA PELGAGGTRQQQPS AS QS CIS ILVCTGVYNPRNPQ 
STEPVLGGGEPPFHGHRDLCFSPGLMEASHWNDVNEAVQLVFR 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRJDPDKYCPSYN 
KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP 
S KLQFNTTNCRS DTVMEKRS FKVPLG KGRRC WLADG FYEWQRC 
QGTNQRO P YFI YFPQI KTEKSGS IGAADSPENWEKVWDNV?RLLT 
MAG 1 FDCWEPPEGGDVLYSYT 1 I TVDSCKGLSDIHHRMPAILDG 
EEAVS KWLDFGE VSTQEALKL I K PTEN I TFHAVS S WNNSRNNT 
PECLAPVDLWKKELRASGS SORMLQWLATKSP KKEDSKT POKE 
ESDVPQVJSSQFLOKSPLPTKRGTAGLLEOWLKREKEEEPVAKRP 
YSQ 


6558 


21 


1138 


FHGRRRGGRKMELGSCLEGGREAAEEEGEPEVKKRRLLCVEFAS 
VAS CDAAVAQCFLAENDWEMERALNS Y FEPPVEES ALERRPET I 
S EPKTYVDLTNEETTDSTTSK 1 SPSEDTOOENGSMFSLITVfNID 
GLDLNNL S ERARG VCS YLALYS PDVI FLOE VIPP YYS YLKKRS S 
WYEI1TGHEEGYFTAIMLKKSRVKLKS0EIIPFPSTKMMRNLLC 
VH VNVSGNELCLMTS HLES TRGHAAER MNQLKMVLKKMQE APES 
ATVIFAGDTNLRDREVTRCGGLPIWIVDVWEFLGKPKHCQYTWD 
TOKNSNLG I TAACKLRFDR I FFRAAAEEGH T I PRSLDLLGLEKL 
DCGRFPSDKWGLLCNLDIIL 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
D KS CRCGVCLPSTCPHTVWLLE PTCCDNCP P PCHI PQPCVPTCF 
LLKSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435


TAT SGG I W LRRKW RCHWpRPLPQS CVGTEGGLQ VRDTS SRI AKG 
G VDHTKMS LHG ASGGHERSRDR RR S S DRSRDS S HERTE SQLTP C 
IRNVTSPTROHHVEREKDHSSSRPSSPRPOKASPNGSISSAGNS 
SRNSSOSSS DGS CKTAGEMVFVY EN A KEG ARN I RTS ER VTLI VD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGEYEVAE 
GI GSTVFRA I LDYYKTGI IRCPDG I S I PELREACDYLCISFEYS 
TI K CRDLS ALMHELS NDG AR RC/FEF YLEE M J L PLMVASAQSGER 
ECKIWLTDDDWDWDEEYPPOMGEEYSQIIYSTKLYRFFKYIE 








NRDVAKS VLKERGLKK I RLG I EGY FTY KEXVKKR PGGRPBV I YN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKS KSITNLAAAAADIPQD 


6561




1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 
AKQLVDRG MQ VLAAC FT EEGSQKLQR D7 S YRLQTTLIjDVTKS E S 
IKAAAOWVRDKVGEOGLWALVNNAGVGliPSGPNEWLTKDDFVKV 
I NVNL VG L 1 2 VTLH M b P MVKRARGR WNMS S SGG R VAV I GGG Y C 
VS KFGVEAFSDS I RRELYYFGVX VC1 1 EPGNYRTAILGKENLES 
RMRKLWERLPQETRDSYGEDYFRIYTDKLKNIMOYAEPRVRDVI 
N SM EHA 2VSRSPRIRYN PGLDAKLLY 1 P LAK.LPT P VTD FILS R Y 
LPRPADSV 


6562 




2562 


MSTLYD3 RAHKACLLRFFAS SDSNKALEQRRTLHTPKLEHL.DRV 
L YEWFLGKRS EGVPVSGPML I EKAKDFY EQMQLTEPCVFSGGWL 
WRFKARHG I KKLDAS S EKQS ADHQAAEQFCAFFRSLAAEHGLS A 
EQVYNADETGLFWR CLPNPT PEGGAV PG PKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLFVAYKAQGNAWVDKE2FS 
DWFHHIFVPSVREKFRTIGLPEDSKAVL.LLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRIJF3NPPVPL0GPKAR 
YNMNDA1 FS VACAWN AVPSHVFRRAWRXLVJPSVAFAEG S SSEEE 
LEAECFPVKPHNKSFAHILELVKEGSSCPGOLRQRQAASWGVAG 
REAEGGRPPAATSFAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLRFAERQ P CFS AQE VGQLRALRAVFR SQQQVRRRR 
GALGAWKVEALQEG PGGCGATAQS PLP CSS TAGDN 


6563 


1329 


2694 


LARPAOPVLLRE PEG AGP P VPAGHLVHH LOGGHLRERAHPDLEA 
HEHPLPCDOMFWRQMGGHLRMVEANSRGVVWGIGYDHTAWVYTG 
GYGGGCFOGLAS STSN 1 YTQSDVKCVH 1 YENQRWNPVTGYTSRG 
LPTDRYMWSDASGLCECTKAGTKPPSL0WAV7VSDWPVDFSVPGG 
TDQEG WQY ASDF PAS YHG S KTMKD FVR R RC WAR KCKL VTSGP WL 
EVPPIALRDVSI I PESPGAEGSGHSI ALWAVSDKGDVLCRbGVS 
ELN PAGSS WLHVGTDOP FAS I S 2 GACYO VWAVARDGSAF YRGSV 
Y PS QP AGD C W YH I PS P P RQRLKQVS AG QTS VYALDENGNLWYRQ 
GITPSYPOGSSWEHVSNNVCRVSVGPLIX)VWVIANKVQGSHSLS 

SOEOEPSAPPEAHGPVCC 


6564 


j 


975 


APGSCAUWSYCGRGWSRAMRGCQLLGLRSSWPGDLLSARLLSQE 
KRAAETHFGFETVSEEEKGGKVYQVFESVAKKYDVMNDMMSLGI
UPVMimT.T .T .WKMHPT .POTOTjLDVAGRTfiDTAFRFLiNYVOSOHOR 
KQKROLRAQCNLSWEElAKEYONEEDSbGGSRWVCDINKEMLK
VG KQ KALAQG YRAGLAVJ VLGDAEEL P FDDDKFP I YTIAFG I RNV
THlDOAliQEAHRVLKPGGRFLCLEFSOV^NPLl SRLYDLYS FQV 
2FVLGEVIAGDWKSYQYLVESIPJ*FPS0EEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


S9S 


RSAVANGLTKRRMGLKLNGRYISLILAVOIAYIiVQAVRAAGKCD 
AVFKGFSDCLLKLGDSMAlfyPOGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCOEGAKDM W DKLR KE S KNLN I OG SLFELCG SGNGAAG S 
LLPAFP VLLVSLSAALATWLS F 


6566 


3 


1385 


KYESAOPGGTQPEPGLGARMAIHKALW.CLGLPLFLFPGAVJAQG 
HVPPGCSOG LNPLY YNLCDRSGAWG I VLSAVAGAG I VTTFVLTI 
ILVASLPFVQDTKKRSLLGT0VFFL1jGTLGLFCIjV?ACVEKPDF 

s tcas r r flfg vlfa i cfs claahvfaln flarknhgprgwvi f 
TVAULLTZjV evi intswli itlvrgsgeggpqgns sagkavas p 

CTVIANKDFWWjIYWLLLLGAFLGAWFAI.CGRYKSWRXHGVFV 
LLTTATS VAI WWW I VMYTYGN KQHNS PTWDDPTLAI ALAANAW 

AFVLFYVI pevsqvtks speqs yqgdmyptrgvgyetilkeqkg 

OSMFVENKA FSMDEPVAAKRP VS PYSG Y1CGQLLTS VYQPTEMAL 








MHKVPSEGAYDIILPRATANSOVMGSANSTIiRAEDMYSAQSHOA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


TKRSNLKAYACSIHHIRTMSYVFVNDSSQTNVPLLQACIDGDFN 
Y S KRLLESG FDPMI RDS RGRTGLHLAAARGNVDI CQLLHK FGAD 
LLATDYOGNTALHLCGHVDTI 0FL VSNGLKI D I CNHQGATPLVL 
AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
MESHS LLN PNLQQGEG VLSS FRTTWQEFVEDLG FWRVLLL I FV I 
ALLS LG I AY YVSG VLP F VENQPE LVH 


6568


3 


1133 


HASDRLLVLPDN YSHFSOAS ANLQGPSRTTELFHPTLAS ISSPK 
LEGAELYFNVDHGYI.EGLVRGCKASLLTQCDYINLVCCETLEDL 
KIHL0TTDYGNFLANHTNPLTVSK1DTEMRKRLCGEFEYFRNHS 
LEPLSTFLTYMTCSYK1DNVILLMNGALQKKSVKEILGKCHPLG 
RFTEMEA\7N I AETPSDLFNAI LIETPLiAPFPQDCMSEKALDELN 
I ELLRNKLYKS YLEAFYKFCKNHGDVTAEVMCPI LEFEADRRAF 
I ITLNSFGTELSKEDRETLY PTFGKLY PEGLRLLAOAEDFDOMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVOMMVLAFNROF 
HYGVF YAYVKLKEQE3 RNI VWI AECI SQRHRTKIN3 YI PI L 


6569 


205 


1532 


RRRGPQRLGKGRPTPLLCRWRTAGPSHWEKQARAFOGLRPVDPR

RKSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI 

AEIQKDVEYRLPFTINKLTININILLPPOFPOEKPVISVYPPIR 

HHLMDKOGVYVTSPLVKNFTMHSDLGKIlQSLLDEFWKNPPVIjA 

PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 

ADTVS S STTSHTTAKPAAPS FG VLSNLPLPI PTVDAS I PTS0NG 

FGYKMPDVPDAFPELSELSVSOLTDMNEOEEVLLEQFLTLPQLK 

QI ITDKDDLVKS I EELARKNLLLEPSLEAKROTVLDKYELLTQM 

KSTFEKXMQRO>IELS ESCSAS ALQAR1.KVAAHEAEEESDN I AED 

FLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQOAIAMHSQFHA 

PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGS KALPAP I PLHPSLQLTNYS FLQAVNT FPATVDHLOG 
LYGLS AVQTMHKNHWTLGYPNVHE ITRSTI TEMAAAQGI.VDAR ? 
? FPALPFTTHLFH P KOGA I AHVL PALH KDR PR FD FANLAVAATQ 
EDPPKMGDLSKLSPGLGSP J SGLSKLTPDRKPSRGRLPSKTKKE 
F I CKFCGRH FT KS YNLL I KE R THTDE R P YTCD I CH KAFRRQDHL 
RDHRY 3HS KEKPFKCOECGKGFCQSRTLAVHKTLHMQTSSPTAA 
SSAAKCSGETVI CGGT 


6571 


169 




APDMKRKKLQKLTDTL.TKKCKHLFRGFDKDNDGCVNVLEWIHGL 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
SEEDPDEGI KDLVE ITLKXMDHDHDGKLSFADYELAVREETLLL 
EAFGPCLPDPKSOMEFEAQVFKDPNEFNDM 


6572 




1646 


TPERAQPGALLGAAGCCVCGGRWWPRSHERGYFSSAKMGSKRRN 
LSCS E RHQ KLVDENY CKKLHVQALKNVNSQ I RNQMVQNENDNRV 
ORKQFLRLI^I^OFELDMEEAIQKAEENKRLKELOLKOEEKLAM 

QXAEKDAI KYEQMKRDAEI AKTMMEEHKRI I KEENAAEDKRNKA 
KAQYYLDLEK0LEE0EKKK0EAYEQLLKEKLMIDE1VRKIYEED 
QLEKQQKLEKMNAMRRY I EEFQKEQALWRKKKREEMEEENRKI I 
EFAKMQQQREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQREP 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEOMA 
LKELVLQAAXEEEENFRKTMLAKFAEDDRIELMNAQKQRMKQLE 
HRRAVEKLI E ER RQ QFLADKQRE LE EWQLQQRROG F I NA I IEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVY0QRSE3 
CEEK 


6573 


767 


27S 


GGGGGESbSFRAOEGTRTPATDCLMYLQGPRKLMTQGGYDMVOK 
LFLDFFRRRLS QRPTAEE LEQRN I LKPRNE QE EQEEKRE I KRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR 








LTAADKVSRGECWKVGGRTVCWVSIiGSPLGSV 


6574 


204 


115$ 


LE S S V PV S VGV FVJ ACGV S V" TG AAG IX)DG ALSDTMARN AEXAMT A 
LARFR0A01>EEGKVKE^ R FFLASE CTELPKAEKWRRQI IGEI S X 
KVAQIONAGLGEFRI RDLNDEINKLLREKGHWEVRI KELGGPDY 
GKVGPKMLDHEGKEVPG^GYKYFGAAKDljPGVRELFEKEPLPP 
PRKTRAELMKAIDFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE 
KWKAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQOKFIAHVPVPSQ0EIEEALVRRKKMELLQKYASETLQAO 
SEEARRLLGY 


6575 


117 


82C 


SPALASOSGGITEEKMLEPQENGVIDLPDYEHVEDETFPPFPPP 
AS PERODGEGTEPDEESGNGAP VPVPPKRTVKRNI PKLDAQRL J 
SERGLPALRHVFDKAKFKGKGKSAEDLKML1RHMEHWAHRLFPK 
LQFEDF1DRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV 
AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEOOORIER 
KKQLALERRQAKLF 


6576 


1 


1060 


P EPOALVGQKRG ALR LLVARLVLTVSAPAEVRRR VLRPVLS WMD 
RETRALADS HFRGLGVDV PGVGQA PGR VAFVS EPGAFS YADFVR 
GFLLPNLPCVFSSAFTOGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WPVANCGVQEYNSNPKEHMTLRDyiTYWKEYIQAGYSSPRGCL 
YLKDWHLCRDFPVEDVFTbPVYFSSDVOJJEFWDALDVDDYRFVy 
AGP AGSWS P FHADI FR S FS WSVNVCGRKKWLLFPPGOEEALRDR 
HGNLPYDVTSPALCDTHLHPRNOlxAGPPLEITQEAGEMVFVPSG 
VfHHQVHNL VMCC FS CPL S G AFLQ EDGS TTS PLS Q PELGWN G VAK 
G 


6577 


2271 


SB*/ 


SDRMASDBFDIVIEAMLEAPYKKEEDEOORKEVKKDYPSNTTSE 
TSNSGNETSGSSTIGETSNRSRDRDRYRRRNSRSRSPGROCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGKSKSPHF 
REKSPVREPVDNLSPEERDARTVFCMQLAARIRPRDLEDFFSAV 
GKVRDVR 1 1 SDRNSRRS KG IAYVE FCE I QS VPLAI GLTGQRLLG 
VP! IVOASOAJ530mLAAMANNLOKGNGGPMRLYVGSLHFNI TED 
MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL 
EQLNGFELAGRPMRVGHVTERLEGGTDITFPDGDQELDLGSAGG 
R FQLMAKL AEG AG I QLPSTAAAAAAAAAAQAAALQLNGAVP USA 
LNPAALTALSPALNLASOCLQLSSLFTPQTM 


6578 


377 


1489 


PSS SATMNRA PLKRAT 1 LHMALTG ASD PS AEAE ANGE K P FL L»RA 
LQI ALWSLYWVTSI SMVFLNKYLLDSPSLRLDTP IFVTFYQCL 
VTTLLCKGLS AL AACCPGA VDFPS LRLDLR VAR S VL PLS WF I G 
MIT FNNLCbKYVG VAF YNVGRS LTTVFNVLI^ YLLLKQTTS F YA 
bLTCGI I IGGFWl^VDOEGAEGTLSWLGTVFGVLASLCVSLNAl 
YTTKVLPAVDGS I WRLTFYIWVNACILFLPLLLLLGSLQALRDF 
AQLGSAH FWGMMTLGGLFG FAIG Y VTGLQI KFTS PLTHNVSGTA 
KACAQTVIJXVLYYEETKSFLWWTSNMMVLGGSSAYTVJWGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRFLSAAAPKWSHRTAPGIMVFYFTSSSVNSSAYT 
I YMGKDKYENEDLI KHGVgPEDIWFH VDKLS SAHVYLRLKKGENI 
EDI PKE VLMDCAHLVKANS I QGCKMNNVNVVYTPWSNLKKTADM 
PVGQIGFHRQKDVKIVTVEKKVNEILNRLEXTKVERFPDLAAEK 
ECRDREERNEKKAQIQEMKXREKEEMKKXREMDELRSYSSLMKV 
ENMSSNQDGNDSDEFM 


6580 


62 


1571 


LVALKNWKPKGTN I PAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYKDTPG 
PREALSQuRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQBLQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KISSSGTAKESPSSMQPOPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRIiSTOHEES ADEQKGSEAEGLKGDI I SVI I AN KPE 
1 ASLERQCVN1jE*NEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI 








CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTL 
HYRTHLVDRPyDCKCGKAFGOSSDLLICHORMHTEEAPyQCKDCG 
KA FS GKGS L I R H YR I H TGEKVYQ CNE CGKSFS QHAC LSS HORLH 
TGEXPYKCKECGKAFI^SSNFNKJiHKJH t <_ 
S KSNLS KHQR VHTG EG SAP j 


6581 


226


476


RVFLKDLSSTPMASNNTASIAQARKLVEOIjKMEANIDRIKVSKA 
AADLMAYCEAHAKEDPLbTPVPASENPFREKKFFCAj i.- 


6582 


1428


718


CFTT KTK C S PV SV P Y bS PbV LRKELES LLEN EGDQ V 3KTSSF3N 
0HP1 1 FWTbVWYFRRbDbPSNLPGLILTSEHCNEGVQLPLSSLS 
QDS KLVY 1 QbbWDN 1 NLHQE PREPLYV S VJRNFNSE KKS S LLSEE 
QQETSTLV ETI RQS 10HNNVLKP INLLSQOMKPGMKRQR S LY RE 
I LFLSLVS LGREN I DI E AFDNE YG I AYNS bS SEI LE RLQ KI DAP 
PSASVEWCRKCFGAPbl 


6583 


487


41 


RIFSMTSGRbRWRCTWRPATALWSASLRLGTSSMHFSPRSISbP 
LSMMLSPbPSNTRGLSPTALFRSPDSEHATSCPRLHLWRCRAPL 
RS P S PbGRLQVbPRS P LHVHTHNS G KE VLGbQ VQRS R S G TG PAC 
SOAGSGAVQGGNWCI F 


6584


189 


1750 


PLPMAALGPSSQNVTEYVVRVPKNTTKKyNIMAFNAADKVNFAT 
WN0ARLERDLSNKK3YQEEEKPESGAGSEFNRKLREEARRKKYG 
1 VbKEFRPEDQPWbbR VNGKSGRKFKG1 XKGGVTENTS Y Y I FTQ 
CPDGAFEAFPVHNWYK FTPbARHRTbTAEE AEEEWER RN KVLNH 
FSIMQQRRLKDODQDEDEEEKEKRGRRKA£EbRIHDI.,EDDLEWS 
SDASDASGEEGGRVPKAKKKAPbAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGOEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEOS 
DfSEESEEEKPPEEDKEEEEEKKAPTPOEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSKPGTPSAE 
GGSTSSTLRAAASKLEOGXRVSEMPAAKRLRLDTGPCSLSGKST 
POPPSGKTTPNSGDVOVTEDAVRRYLTRKPMTTKDLbKKFOTKK 
TGLSSEQTVNVbAQI LKRLNPERKMINDKKH FSLKE 


6585 


3 


1678 


GP3 RKSR1 DDFVGGDPRAEAS CS VbHSK PHAMADSRDPASDOMQ 
HWXEQRAAQKADVI .TTGAGNP VGDKLNVITVG PRGPLLVQDWF
TDEMAHFDRBRI PERVVHAKGAGAFGYFEVTHDITKYSKAKVFE
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGIWDL 
VGNNTPIFFIRDPILFPSFIHSQKRNPQl'HbKDPDMVvvDFWSLR 
PESLHOVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANGEAVYCK 
FHYKTDQG I KNLSVEDAARLSQEDPDYG I RDLFNAI ATGKY PS W 
TFY1 QVMTFKOAETFPFNPFDbTKVWPHKDY PLI PVGKLVLNRN 
PVNYFAEVEQIAFDPSNMPPG1EASPDKMLCGRLFAYPDTHRHR 
LGPN YbH I PVNC P YRAR VANYQRDGPMCMQDNQGGAPNY Y PN S F 
GAPEQQPS ALEHS I QY SGEVRRFNTANDDNVTQVRAF YVNVbNE 
EQR KRLCENI AGHLKDAQI PI QKKAVKN FTEVH PDYGSH 1 QALL 
DKYNAEKPKNAI HTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGb 
NbGKKISVPRDVMLEELSLLTNRGSKMFKLROMRVEKFIYENHP 
PVFSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSOAGG 
S GS AGO Y GS DQQHHLGSGS G AGGTGGPAGQAGRGGAAG TAG VGE 
TGSGDOAGGEGKH I TVFKT YI S PWERAMGVDPQQKMEbG I DLLA 
YGAKAEbP XYKS FNRTAMP YGG YEKAS KRMTFQMPKV 


6587 


75 


1117 


RRVPSLGKMPECWDGEHD2 ETPYGbbHWI RGSPKGNR PAIbTY 
HDVGLNHKLCFNTFFNFEDMQEITKHFWCHVDAPGO0VGASQF 
POGYQFPSMEQLAAMLPS WQH FGFKYV3 G I GVGAGAY VbAKFA 
L I FPDLVEGLVL VN I D PNGKGW I DWAATXLS GLTSTLPDTVLS H 
LFSOEELVNNTEIiVOSYRQOlGNWNQANbQLFWNMYKSRRDLD 
INRPGTVPNAKTLRCPVMLWGDNAPAEDGWECNSKbDPTTTT 
FLKMADSGGLPQVTQ?GKLTEA FKY FbQGKG YMP SAS KTRLARS 
RTA5LTSASSVDGSR?QACTHSESSEG)W5QVNHTMEVSC 


6588


137 


501 


LGLOAOLLEIiRTNNYQIjSDELRKNGVELTSLROKVAYLDKEFSK 
AQKALS KS KKAQEVE VLLSENEMLOAKLHSQEEDFRLONSTLMA 
E F S K LCS QM EQLE QENQQLKEG AAGAG VAQAG F 


6589 


2 


1405 


RPWG S AMATFSRQEFFQQLLQGCLLPTAQQGLDQ1 WLLLAI CUV 
CRLLVmLGLPSYLKHASTVAGGFFSLYHFFQLHKVVJVVLLSLLC 
YLVI,FL-CRHSSHRGVFLSVTlLIYLLMGEMHMVDTV , nv , HKKRGA 
OMlVAMKAVSJbGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
1SFKS YLQAVQGR PLSCRWLQKVARSLALALLCLVLS TCVGPYL 
FPYF1 PLNGDRLLRNKKRKARGTMVRWLRAYESAVS FH KSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW 
TSWKL?KSYWLNIWFKNALRLGTFSA\T.VTYAASALLHGFSFH 
LAAVLLSLAFITYVEHVLRKRLARILSACVLSKRCPPDCSHQHR 
LGLGVRALNLLFGALA1FHLAYLGSLFDVDVDDTTEEQGYGMAY 
TVKKWSELSWASHWVTFGCKIFYRLIC 


6590 


217-; 


656 


VRAY EHVLS LLENV FT PMFCHRDEY FRQLLRGAES PTRNSKLNR 
GS LS LtDD FRNTQKRGES FGI SR IGSK1 KG VFKSTTMEGAMLPN Y 
GV AF GEDDF I EEG I WMEDDS PVE AV ST PNT PRNLAAW K.I S I PY 
VDFFEDFSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLF FYVLESKLTEFHGAFPDAQLPS KRI 1 GPKNYE FLKSKREE 
FQEYLQKLLQHPELSNSQLLADFLS PNGGETQFLDKI LPDVNLG 
KI 3 KSVPGXLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSE^NKKLFNDLFKNNANRAENTERKQNONYFMEVMTVEGVY 
DY LM YVGRW FQVPDWLHHLLMGTR I LFKNTLEMYTDYYLQCKL 
EQ L FO EHR LVS L I TLLRDAI FCENTE PR S LQD XQKG AKQTFEEM 
KNTYIPDLLVKCIGEETKYESIRLLFDGL.QQPVLNKOLTYVLLDI 
VI OELFPELNKVQKEVTS VTSWM 


6591 


2171 


656 


VRAYEWLSLLENVFTPMFCHRDEYFROL'LRGAESP'rRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMIjPNY 
GVAEG E DDF I E EG I WMEDDS PVE AVS T PNTP RN LAAW K I S I P Y 
VD FFE D PS SER KEKKERI PVFCIDVERNDRRAVGHE PEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKR1IGPKNYEFLKSKREE 
FCEYLOKLLQHPELSNSQLLADFLSPNGGETQFLDK I LPDVNLG 
K12KSVPGKLMKEKGQHLEPF1MNFINSCESPKPKPSRPELTIL 
S PTS ENNK KLFNDLFKNNANRAENTER KQNQN Y FME VMT VEG VY 
DYLMYVGRWFQVPDWLHHLLKGTRILFKNTLEMYTDYYLQCKL 
ECLFOEHRLVSLITLLRDAIFCEHTEPRSLODKQKGAKQTFEEM 
MNYJ PDLLVKCI GEETKYESIRLLFDGLQOPVLNKOLTYVLLDI 
VIQELFPELNKVQKEVTS VTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAECRLKA3VAEKFAIATKEG 
DLP CV E R FFK I FPLLGLHEEGLRKFS E YLCKQ VA S XAJE ENLLMV 
LGTDKS DRRAAVI FADTLTLLFEG I AR I VETHQP I VETY YGPGR 
LYTLjKYLOVECDRQVEKWDKFIKQRDYHQQFRHVONNLMRNS 
TTEK3 EPRELDPILTEVTLMNARSELYLRFLKKR ISSDFEVGDS 
MAS E E V KOEHQKCLDKLLNNCLLS CTMQELI GL YVTM EE Y FMR E 
TVNKA VALDTYEKGQLTS SM VDDVFY I VKKC3 GRALSSS S IDCL 
CAM IN LATTELESDFRDVLCNKLRMGF PATTFODI ORGVTSAVN 
TMuccLoOGKFDTKGTF^TDEAKM^FLVTTWNVEVrSENTSTT.Tf 
KTLES DCTKLFS QG I GGEQAQAKFDSCLSDLAAVS WK FRDLLQE 
6LTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
OFILKLEO0MAEFKASLSPVIYDSLTGLMTSLVAVELEKV\XKS 
TFNRLGGLOFDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 
LERVTE I LDYWG PNSGPLTWRLTPAEVRQVLALRI DrRSED I KR 
LRL 


6593 


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR 
RGGEGGGGRGRGDKRRHRQARRQRRRPEPAEARGGKMADVLSVL 
RQYN I QKKE I WKGDEVI FGE FSWPKNVXTNYWWGTGKEGOPR 



WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EYYTLDS I LFLLN>P/HbSHPVYVRRAATENI PWRRPDRKDLLG 

l LinuCAC? I orto J. Urvo/irJUH A vjIjV'" O IV' IxJv/irtUt V Ljs\J^nl\ I\r t\ J. 

EDEECVRbDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
I KAKI I^AKKRSTI KTDLDDDI TALKORS FVDAEVDVTRDIVSRE 
RWRTRTTI LQSTGKNFSKNI FAI LOSVKAREEGRAPEQRPAPN 
AA P VDPTLRTKQP 1 PAAYNRYDOERFKGKEETEGFKI DTMGTYH 
GMTLKS VTEGASARKTQTPAAOPV PR PVSQARP PPNOKKGSRTP 
I I I IPAATTSLITMLNAKDLLOOLKFVPSDEKKKQGCQRENETL 
IORRKDQMQPGGTAISVTVPYRWDOPbKLMPODWDRWAVFVQ 
GPAWOFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 


6594 


1 


109t 


EFPGRRFRGSOASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
DILSTIGYDNIIOHLNNGRKNCKEFEDFLKERAAIEERYGKDLL 
NLSRKKPCGOSEINTLKRALEVFKQOVDNVAQCHIQLAQSLREE 
AR KMEEFREKQKLQRKKTEI; I MDA 1 HKQKSLQFKKTMDAXKNY E 
0 KCR DKD E AEQAVS RS ANL VN P KQQE KLFVK LATS KTAVEDSDK 
AYMJjnlC: ri»DiCVREE>vQSEHl KACEAF EAQECERINr rRNAL>Wli 
HVWQLSQQCVTSDEMYEOVRKSbEMCSIQRDIEYFVNQRKTGQI 
PPAPIMYENFYSSOKNAVPAGKATGPNIjARRGPLPIPKSSPDDP 
' NYSLVDDYSLl.YQ 


6595


57 


783 


PLGTMSDSDLGSDEGLLSLAGKRKRRGNLPKESVKILRDWbYLH 
RYNAYPSEQEKLSLSGQTNbSVbQICNWFlNARRRL-LPDMLRKD 
GKDPNOFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVbSLS 
VCSMPbHSGOGEKPAAPFPRGELESPKPbVTPGSTbTljLTRAEA 
GSPIGGJjFN rPPPTFPEQDKEDF SSPC?bljVEVAIjUKA^MliIA?iv 
QODPSLPLLHTP I PLVS ENPQ 


6596


2 


102t 


PRLPVR RYHGHRRLQGRSRGHMAEGDAGSDORON EEI EAMAAI Y 
GEEWCVIDDCAKIFC1RISDDIDDPKWTLCLQVMLPNEYPGTAP
F 1 YOLNAPWIiKGQERADLSNSLE E I Y I QNI GES I LYbWVEKI RD
VL 1 QKSQMT EPGPDV KKKTEEEDVEC EDDL I LACQPESSVKALD
FD1 SETRTEVEVEELPPIDHGIPI TDRRSTFOAHLAPWCPKQV
KMVLSKLYENKKIASATHNI YAYR I YCEDKQTFLQDCEDDGETA 

I LVEKN YTNS P EES SKALGKNKK VR KDKKRNEK 


6597 


2 


1026 


PRLPVRR YHGRRRbQGR SRGHMAEGDAGSDQRQNEE I EAMAAI Y 
GEEWCVIDDCAKI FC1K ISDDIDDPKWTbCbQVMLPNEYPGTAP 
PI YQLNAPWLKGQERAObSNSbEEI Y I QNIGES I bYLWVEKIRD 
VLlOKSQMTEPGPDVKKKTEEEDVECEDDLIbACQPESSVKALD 
FDT ^ FTRTFVF VFFT iPP 1DHGI P I TDRR^TFOAHIAPVVCPKOV 
KMVLSKbYENKKIASATKNIYAYRIYCEDKQTFLQDCEDPGETA 
AGGRbLHLME I LNVKNVMVWSRWYGG I LLGPDRFKH INNCARN 
IbVEKNYTNSPEESSKALGKNKKVRKDKKRNEK 


6598 


1099 


415 


PRVRWATTMAMSFEWPWQYRFPPFFTLOPNVDTRQKQLAAWCSb 
VLSFCRbHKQSSMTVMEAQESPLFNNVKLQRKLPVES I Ql VLEE 
LRK KGNbEWLDKS KS S FL2 MWRRPEE WGKL 1 YQWVSRSGQNNS V 
FTLYEIiTNGEDTEDEEfHGLDEATbLRALQALQQEHKAEIlTVS 
DGPRRQVLLAGTCLPbLLTS HLS RAFKRRQTOCP PKTGSVTPPD 
SKGLOS 


6599 


164 


1591


KJ4AALTTLFKYIDENQDRYIKKLAKWVAI0SVSAWPEKHGEIRR 
MME VAAADVKOLGGSVELVD I GKQKbPDGSE I PLP P I bbGRLGS 
D POKKT VCI YGHbDVQP AAL EDG WDSE P FTLVER DGKLHGRGST 
DDKGPVAGV7INALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
LIFARKDTFFKDVDYVCISDNYWbGKKKPCITYGLRGICYFFIE 
VECSNKDLHSGVYGGSVHEAMTDbl LbMGSLVDXRGNILI PG1W 
EAVAAVTEEEHKLYDDI DFDI EEFAKDVGAQI LLKSHKKDILMH 
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 








WGEQVTS YLTKKFAELRS PNE FKVYMGHGGKPW VSD FS HPHYL 
AGRRAMKTVFGVEPDLTREGGS I PVTLTFQEATGKNVMLLPVGS 
ADDGAHS ONE KLNRYWY IEGTKMLAAYLY BVSQLKD 


6600 




934 


PGRLFRVAAr^IESAGLEQLLRELLLFDTERIRRATEQLQIVLRAP 
AALSALCDLLASAADPQIROFAAVLTRRRLNTRVJRRIjAAEQRES 
LKSLILTALQRETEHCVSLSLAQLSATIFRKEGLEAWPQLLQLL 
OHSTHSPHS PEREMGLLLLSVWTSR PEAFQPKHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMO 
TLI P IDEAKACEALEALDELLESEVPVITPYLSEVLTFCLEVAR 
NVALGNAIR IRI LCCLTFLVKVKS KALLKNRLLATLAAHPrPHC 
GC 


6601 


525- 


142C 


PRAA ARAP P PAV LRR DRRAATAPG AG E MTLHG PLAQR Y FLNH I E 

rvj. i. i * j\ ii* v> jt Linriih* i-iri r r\ vojirv tr \^ jvoi'irt v it )~> v i% 

NHQHQQCMA P S TLSQQNHP TON P P AG LMSK PN ALTTQQQQO0KL 
RLORIQMERERIRMRQEELMROEAALCROLPMEAETLAPVOAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREOSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGQTPMN I NPQOTRFPDFLDCLPGT 
NVDT fiTLESFnij T PT . FNnVT* 1 ? ALN K <J ^ PFLTWL 

J. » V U±J\J X U£fO£iL/JJ X rij ri*UV OOAUlvA »J £r C XJ1 FliJ 


6602 


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTOSLREVIKAKTKARNF 
ER VLGKI TLVSAAPGKVI CEMKVEE EHTNA1GTLHGGLTATLVD 
NISTI4ALLCTERGAPGVSVDMNITYKSPAKLGEDIVITAHVLKQ 
GKTLAFTS VDLTNKATGKLI A0GRHTKHLGK 


6603 


lb 


660 


PVGPSSIjAARTG1X5HLPFLHRLASSRGLD^LLOFLAFI.iFVLLL 
HQOYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 

afsh wentaffgdwlrfpri vhy y fdhnsnwnlli r wg isfc 
nctgvfnqgphs ? I lslm 


6604 


3 


68£ 


tstaqrqggermsfrgggrggfnrggggggfnrggssnhfrggg 
ggggggnfrgggrggfgrgggrggfnkgqdqgppervvllgefl 
hpceddi vckcttdenkvpyfnapvylenkeq3 gkvdei fgqlr 

nFYF < ?VKIi c ?FNMK^C:cFVKT,OKF*YTnPYKIiT ■PI.OSFI.iPRPPGFK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


846


SGSRRGAMRAAGVGLVDCHCHLSAPDrDRDLDDVLEKAKKANW 
ALVAVAEHSGEFEKIMQLSERYNGFVIrPCLGVHPVQGLPPEDOR 
SVTLKDLDVALPIIENYKDRLLAIGEVGLDFSPRFAGTGEQKEE 
QRQVLIRQIOLAKRLNIjPVI^VHSRSAGRPTINLLOEQGAEKVLL 
HAFDGR PS VAMEG VRAGYFFS I PPS 1 I RSGQQKLVKQLPLTS I C 
LETDS PALGPEKQVRNEF WN I S I S AE Y I AQVKGI S VEE VIE VTT 
0MALKLFPK1RHLLQK 


6606 


2 


1685 


FVEIRPRAEVANLSAHSASPI0DAVLKRLSLLED1VYRQLNGLS 
KSLGLI EG YGGRGKGGLPATLS PAEEE KAKGPHEKYGYNS YLS E 
XI SLDRS I PDYRPTKCKELKYSKDLPOISI IFIFVNEALSVILR 
S VHSAVNHTPTHLLKEI ILVDDNSDEE ELKVPLEEYVHKRY PGL 
VKWRNQKREGL1 RARI EGWKVATGQVTGFFDAHVEFTAGKAEP 
VLSR1 QENRKRVILPS IDNI KQDNFEVQRYENSAHGYSWELWCM 
YISPPKDIWDAGDPSLPIRTPAMIGCSFVVNRKFFGEIGLLDPG 
MDVYGGENIELG1KVWLCGGSMEVLFCSRVAHIERKKKPYKSNI 
GFYTKRNALRVAEWJMDDYKSHVYIAWNLPLENPGIDIGDVSER 
RALRKSLKCKNFQWYLDHVYPEMRRYKNTVAYGELR1WKAKDVC 
LDQG P LENHTAI L Y P CHGWG POLAR Y TKEGFLHIjGALGTTTLL P 
DTRCLVDNSKSRLP0LLDCDKVKSSLYKRWNFIONGA1MNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWTIKNSIK 


6607 


137 


986 


VPACAGLKKEARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 

gi s fcgrggagpg vptrtovfaamgavmgtfsslqtkqrrpskd 

KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 








SGVVNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFE 
BFVTAXjSI LLRGTVHEfCLRWTFNLYD I NKDGYINQEEMMDI VKA 
IYDMI4GKYTYPVLKEDTPR0KVDVFF0KMDKNKDGIVTLDEFLE 
SCQEDDNI MRSLQLFQNVM 


6608 


224 


114C 


RPCFSSPTGLCPRLSYPMIL.L0HAV1.PPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEQDCALEELCKPLY 
CKLCNVTLNSAOOACAHYQGKNKGKKLRNYYAANSCPPPARMSN 
WEPAATFWPVPPCMGSFKPGGRVILATENDYCKLCDASFSSP 
AVAOAHYQGKNHAKRLRLAEA0SNSFSESSELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSG P YFNPRSRQR I PRDLAMCVTF SGQ FYC 
SMCNVGAGEEME FRCHLESKQHKSXVS EQRYRNEMENLGYV 


6609 


1 


445


FRLRCRRFRVAGGRLAGAGLRESRVPAPEORLSALTLLSWSAVT 
PAAEPGNFOI*SFAEPRGPIjASPVRAAPRAPCPAAEMSELNTKTS 
PATNQAAGQEE KG KAGNV K KAEEEE E 1 D 1 DLTA P ETEKAALAI Q 
GKFRRFOKRKKDPSS 


6610 


31S 


883 


GRKS LCNLH 1 FIR FPLTYPDMYWGMMCTAKKCG I R FQPPAI I LI 

VP C P 7 YC1 V T T> DD T MOWN PC V P C T*H^TT? li A POT . V NTtf DO H V Q VT .PH 

vslrqleklfsflrgylsgoslaetmeoiorettidpeedlnkl 
ddkelakrksimdelfeknqkkkddpnfvydievefpqddqlqs 
cgwdtesadef 


6611 


976 


212 


PGCSGAGSRVWWIjPALRHUAMGSTESS egrrvs fgvdeeervrv 
ux?^lsenvvwrmkepssp?paptsstfglqdgn^raphkest 

T.DT5CfiCCO/^onDcr?MVT?r , WDVPn'K , u2i a i nnvT.PfM/avpppPiva 

tkhskaslptgegsisheeqksvrlarelesreaelrrkdtfyk 
eqlerierknaemyklsseqfheaaskmesrikprrvepvcsgl 
qac i lhcyrdrpkevllcs dlvka yqrc vs aahkg 


6612 


1724 


992 


VSTHASALSRT0GOPQROPRAAASGAGAGTAGGGGSGGAEGSKM 

SSKTAAKLSTSAKRIQKELAEITLDPPPNCSAGPXGDNIYEWRS 
Tl LGPPGS VYEGGVFFLD3 TFS PDYPFXPPKVTFRTRI YHCNIN 
SOGVI CLDI LKDNWSPALTI SXVLLS I CS LLTDCNPADPL VGS 1 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


749 


ELELSSNMPEQSKDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI 
PTVEDTYRQVISCDKSICTIiQITDTTGSKOFPAMQRLSISKGHA 
FI LVYS I TSRQS LEELKP I YEO 1 CEI KGDVSS I P 2 MLVGNKCDE 
SPSREVQSSEAEALAR'J'WKCAFMETSAKLNHNVKELFQELLNLE 
KRRTVSLOI DGKKSXOQKRKEKLKGKCVI M 


6614 


3 


1191 


SSAAEAMR VLVRR CWGPPLAHGARRGR PS PQWRALARLGWEDCR 
DSRVREK? P WRVLFFGTDQFAREALRALHAARENKE EELIDKLE 
WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF 
GRLLNEALI LKFPYG ILNVHPS CLPRWRG PAPVI HTVLHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVbSRLGAN 
ML I S VLKNL PESLSNGRQQPMEG ATYAP K I S AGTS C 1 KWEEQTS 
EQIFRLYRAIGNI I PJjQTLWMANTI KLLDLVEVNSS VLADPKLT 
GQALlPGSViyHKOSQILLVYCKDGWIGVRSVMLKKSLTATDFY 
NGYLHPWY0KNS0A0PSOCRF0TLRLPTKKKQKKTVAMOOCIE 


6615 


832 


3b 


GR VGAGASAMSELFGDVRAFLREH PSLRLOTDARKVRCI LTGHE 
LPCRLPELOVYTRGKKYORLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLRHINKCPEHVLRHT0GRRY0RALCKYEECQKOGVEY 
VPACLVHRRRRREDOMDGDGPRPREAFWEPTSSDEGGAASDDSW 
TDLYP PE L FTR KDLGS TEDGDG TDD FLTD KEDEKAKP PREKATD 
EGRRETTVYRGLVOKRGKKOLGSLKKKFKSHHRKPKSFSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNC3LCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPS3CRS 








PAWVKMAP WP PKGLV PAV LWGLSLFLNLPGPI WLQPSPP PQSS P 
P PQPH PCH TCRG LVDS FN KGLERTI RDN FGGGNTAVJ EEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLELSEELVESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EG TRGG S GH CDCQAG YGGE ACGQCGLG y FEAERNAS HLVCS AC F 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEGANCGAD 
QFCVNTEGS YECRDCAKACJbGCMGAGPGRCKKCS PGYQQVGSKC 
LDVDECETE VCFGEN KQCENTEGGYRC I CAEGYKQMEG I CVKEQ 
I PES AGFFS EMTEDELWLQQMFFGI 1 1 CALATLAAKGD t>V FT A 
I F I GAVAAMTGY WLS ERSDRVLEGFI KGR 


6617 


118 


673 


VWMAWQVSLLELEDRLQCF 1 CLEVFKESLMLQCGHSYCKGCJjVS 
bS YHLDTKVRCFMCWQAVDGSS SLPNVSLAW I EALRLPGDPEP 
KVCVHHRN P L.SL FCEKDQEL I CGLCGLLG S HQHHP VTP I ST VCS 
RMKEEIiAALFSELKOEQKKVDELIAKLVKNRTRlDGSAPSLCPC 
LGPATFTFL 


6618 


548


136 


DG KVARRAPNS PAFQNDI Y PL VSAPRATTAESPWSKVLQNTQCR 
NVPKMTSERSRIPCLSAAAAEGTGKKQOEGRAMATLDRKVPSPE 
AFLGKPMSSWIDAAKLHCSDNVPUBEAGKEGGKSREVMRbNKEA 
WKYGT 


6619


246 


842 


PASS E VLTAAVMFLLLNCI VAVSONMG I GKNGDLPRPPLRWEFR 
YFORMTTTSSVEGKONLVIMGRKTVJFSIPEKNRPLKDRINLVIjS 
RELKEPPQGAHFLARS LDDALKLTER PELANKVDMI WI VGGSS V 
Y KE AMNHLGH LKLFVTR IMQDFESDTFFS E IDLEKYKLLPE YPG 
I LSDVQEGKHI KYKFEVCEKDD 


6620 


3 


1B7S- 


NSRVDDFVARARKAAENEASQESALGAYSPVDYMS1TSFPRLPE 
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGSQDGSPLRETRKDPFSAAAAECSCROBGLTV1VTACLTFATG 
VTVALVMQ I Y FGDPQ I FQOGAWTDAARCTSLGI EVLS KQGSS V 
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALREETLQRSWETKPGI.LVGVPGMVKGLHEAHQLYGRLPWS 
OVLAFAAAVAODGFNVTHDLARALAEQLPPNMSERFRETFLPSG 
RPPLPGSLLKKPDLAEVLDVLGTSGPAAFYAGGNLTLEMVAEAO 
HAGGVITEED?SNYSALVEKPVCGVYRGHI>VIiSPPPPHTGPAljI 
SALN ILEG FNLTSLVS REQALHWVAETLK 1 ALALASRLGDPVYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPIjLPVYELDGAPT 
AAQVLIMGPDDFIVAMVSSLNQPFGSGL1TPSGILLNSQMLDFS 
WPNRTANHSAPSLE^SVQPGKRPLSFLLPTVVRPAEGLCGTYLA 
LGANGAARGLSGLTOVRFTPWLAFFSREPSCGLDCRCLSYLWLV 
SIPHAANMG 


6621 


1 


662 


VOG I TS YQQR LOALRKEKSR DAARSRRGKENFE FYELAKLLPIj P 
AAI TSQLDKAS I IRLT I S YLKMRDFANOGDPPWNXRNEGPPPNT 
SVKVI GAQRRRSPS ALA1 EVFEAHLGSH 1LQSLDG YVFALNCEG 
KFLYI SETVS I YLGLSQVELTGSS VFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSOGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


31 S 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSCI I KRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN 
IFDMAGHPFFYEVRKPF 


6623


1886 


189 


KALF2KVK KFR LHVEEGD I LYAMYVRQT VL KVI KFL1 1 IAYNSA 
LVSKVQFTVDCNNTOIODMTGYKNFSCNHTMAHLFSKLSFCYLCF 
VSIYGLTCLYTLYWLFYRSLREYSFEYVROETGFDDJPDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNNEWTPDKL 
ROKLQTNAHNRLELPLI MLSGLPDTV FEITELQSLKLEI I KNVM 
1PATIAQLDNL0ELSLHOCSVK1HSAALSFLKENLKVLSVKFDD 
MRELPPWMYGLRNLEELYL.VGSLSHDISRNVTLESLRPLKSLKI 
LS I KSNVSKI POAWDVS 5HLQXMC I HNDGT KLVMLKNLKKMTN 
LTELELVHCDLERIPHAVFSLLSLQEIjDLKENNLKSIEEIVSFO 








HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFS'B^KIEVLPSH
LFLCNKIRYLDLSYNDIRF1PPEIGVLQSLQYFS1 TCNXVESLP
DELYFCKKLKTLKJGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI
LPPELGD CRALKRAGLVVEDALFETLPS1>VRE0KKTE 


6624 


218 


1766 


GS RR GGG S R I P AVS TH VAPGRS VLRP FASG ALRLR S LV KALGGC 
RGRPSGLAHLSOETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAI PAMWPNATLLEKLLEKYKDEDGE WW I A 
KQRG KRAITDNDMOS I LDLHNKLRSQVYPTASNMEY MTWDVELE 
RSAESWAESCLWEHG PAS LLPS IGQNLGAHWGR YR V PTFHVQSW 
TOEVKDFSYPYEHECNPYCPFRCSGPVCTHYTOWKATSNRIGC 
AINLCKNMNIWGOIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS 
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIEhQQSQVHDT 
HVRTRS DDSS RNE VI S AOQMSQ I VS CE VR LRDQ C KG7TCNR YEC 
PAGCLDSKAKVIG5VHYEMQSSICRAAIHYGIIDNPGGWVDITR 
QGRKHY FI KSNRNG I QTIGKYOS ANS FTVSKVT VCAVTCETTVE 
QLCPFKKPASHCPRVYCPRKLYASKSTLCSCNWWSSLF 


6625 


1124 


543 


PGPRGGGGSLLSTKA1XSRSRGLGMHPGPSSGGTEGGVPTALRPP 
GPLVPSTS DDNLLKN 1 ELFCKLALRFHGRLLFLKDVLGDEI CCW 
SFYGOGRK1AEVCCTS I VYATEKK0TKVEFPEAR3 FEETbNILI 
YET PRGFDP ALLEATGGAAG AGG AGRGEDEENR EKK VRR I HVRR 
HITHDERPHGQQIVFKD 


6626


3 


1498 


SAVEFVYTDRFHLILGISVEFLCSLRSDATMES I TACLHALQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPS30LASLEW 
RQI 1 CAAQEHVK EKR RS AE VDDG AAE KE TLPEFGEG KD TGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILLED 
GSRLVSAALVI LSEL PA VCS PEGS ISILPTILYL Tl G VLRETAV 
KLPGGQLSSTVAASLQALKG 1 LSS PMARAEKSRTA KTDLLRS AL 
TO LDCWDPVDETHOELDEVSLLTAITVF3LSTS PEVTT I PCLQ 
KRC I DKFKATLE I KDPWQI KTYQLLHS I PQY PN P AVS Y P Y I YS 
LAS C I MEKLQE IDKRKPENTAELE I FQEG1KVLETLVTVAEEHH 
RAOLVACLLPI LI SFLLDENSLGSATS IMRNLHDFALQNLMQIG 
PQYSS VFKSLVAS S PALKARLEAAI KGNQESVKVK 1 FTS KYTKS 
PGKKSS I QL1CTS FL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL 
GDTG VG KTCFL 1 0 FKDGAFLSGTF I ATVG I D FRNKWT VDG VR V 
KLO I WDTAGQE RFRS VTHAY YRDAQALLLLYD J TNK£ S FDN I RA 
WLTEIHEYA0RDWIMLLGNKADMSSERVIRSEDGF7LAREYGV 
PFLETSAKTGIWVELAFLAIAKELKYRAGHQADEPSFOIRDYVE 
SQKKRSSCCSFM 


6628 



1 


1861 


QCAE FGGGSGGGGGSGGGGSGGGRG AGGEENKEK Ex FSAGS KAN 
KEPGDSLSLE1LQIIXESQQ0HGLRHGDFQRYRGYCSRR0RRLR 
KTl^FKKGNPJiKFTGKKVTEELLTDimYliLVIiKDAERAWSYAM 
QLKOEANTEPRKRFHLLSRLRKAVKHAEELERLCES-<rRVDAKTK 
LEAO AY TAYLSGMLRFEHQ EWKAAI E AFNKCKT I YE KLAS AFTE 
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQKRLRSGGTE 
GLLAEKLEAL2 TQTRAKQAATMSEVE WRGRTVP VK I DK VR J FLL 
GLADNEAAI VQAESEETKERLFESMLS ECRDAI QWK EELKPDQ 
KQRDY I LEGE PG K VSNLQ YLHS YLTYI KLSTA3 KRKENKAKGLQ 
RALLQOCPBDDSKRSPRPQDLIRLYDIILQNLVELLOLPGLSED 
KAFQKE1 GLKTLVFKAYRCF FI AOS YVLVKKWSEALVL YDRVLK 
YANEVNSDAGAFKNS LKDLPDVQELI TQVRSEKCS LGAAAI LDA 
NDAHOTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFQPI PCKPLFFDLALNHVAFPPLEDKLEQKTKSGL VGYI KGI F 
GFRS 


6629



5653 


4S49 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFFETSEPVWILG 
RKYS 1 FT EKDE ILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








MLRCGOM 1 FAQAbVCRHLGRDWRWTORKRQ PDS Y FS V LNAFI DR 
KDSYYSIHOIAOMGVGEGPCSIGQWYGPNTVAQVLKKLAVFDTWS 
SLAVHI AKDN7WMEE1 R R LCRTS VPCAGATA FPADSDRHCNGr 
PAGAEVTNRPSPWRPLVLLIPLRLGLTDINEAYVETLKHCFMMP 
0SLGVIGGKPNSAHYFIGYVGEELIYLDPHTTQPAVEPTDGCF1 
PDESFHCQKPPCRMSIAEIjDPSIAVVRGGHLSTQAFGAECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGG I RRRS AWGAMPGR H VSR VRALYKRVLQLHRVliP PDLKS 
LGDQYVKDE FR R H KTVGS P EAQR FliQE W KV Y ATALLQOAN ENRQ
N STGX ACFGTFLPEEKLNDFRDEQ1 GQLQELMQEATKPNUOFS 1 
SBSMKPKF 


6631


2 


423 


LVOCGGTRKR^AWGAMPnRHVSRVRALYKRVLOLHRVLPPDLKS 
bGDQYVKDEFRRHKTVGSDEAQRFl>QEWEVYATALLQQANENRQ
NSTGKACFGTFLPEEKLNDFRDEQI GQLQELMQEATKPNRQFS I 
SESMKPKF 1 


6632


1273 


568 


GIC^U?ISLEDTQKELEHM'\mKILNLRVFEDESGKHvJSKSVMD
K0YE I LCVSOFTLQCVLKGNKPDFHLAMPTEOAEGFYNS FLEQL
RKTYRPELI KDGKFGAYMCVHIQNDGPVTIELESPAPGTATSDP 
KOLSKLEKOQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
CV5SEREP 


6633 


1145 


617 


ATGRHEGVPTLEGIIQOliVNGIITPATIPSLGPWGVLHSNPMDY
AWGANGbDAl 1 TQLLNQFENTGPPPADKEKI QALPTVPVTEEJ3V 1 
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 


6634 


1 


1134 


CGGI PRKGSGPRRRLPKARLRDCLPRL»MLTLRSLL?WSL,VYCYC
GLCAS I HLLKbLWSLGKGPAQTFRRPAREHPPACLSDPSLGTHC
YVRIKDSGLRFHYVAAGERGKPLKLIjLHGFPEFWYSWRYObRHF
KSEYRWALDLRGYGETDAPIHRONYKLDCLITDIKDIbDSLGY
S KCVL I GHDWGGM IAWLI A I CYPEM VMKLI VINFPH PNVF7EYI 
T.RHPAQLLKSSYYYFFOIPWFPEFNFSINDFKVLKHLFTSHSTG 
I GRKGCOLTTEDLEAYIYVFSQPGALSGPINHYRNI FSCLPLKH 
HMVTT PTLLLWG ENDAFME VEMAEVTR F YVKN YFRLT IL S EAS H 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRV^TRAWRLPREGLGPHGPSFARVPVAFSSSSG 
GRGGAEPRPLPLS YRbI»DGEAAljPAWFljHGliFGSKTNFNS I AK 
I LAOQTGRR VLT VDARNHG DS PHSFDMS YEIMSQDLQDLLPQLG 
LVPC\nA/GHSMGGKTAMLLAUJRPELVERLIAVDISPVESTGVS 
HFATYVAAJ^INIADE^PRSRARKLADEOLSSVIQDMAVRQHL 
LTNLVEVDGRFW7RVNLDALTQHLDKILAFP0R0ESYLGPTLFL, 
LGGNSQFVHPSKH PE I MRLFPRAQMQTVPNAGHWI HADRPQDF I 
AAIRGFLV 


6636 


1514 


1801 


SFCMFSHK0DSKFOAVPV0EKKKRLRRAPWRAFAQPORLKHPAE 
OPIVRQCLORPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDGTCVLDKAGSYKCACLAGYTGQRCENLLBAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRMAKIGT 
WS FFCNNS YVLSGNSKRTCQQNGEWSGKQPICIKACREPKI SP 
LVRRRVLPMOVOSRETPLKOLYSAAFSKQKL0SAPTKKPALPFG 
DLPMGYOHLHTOliQYECISPFYRRIoGSSRRTCLRTGKWSGRAPS 
CI P ICGKI ENI TAPKTQGLRW PWQAAI YRRTSGVHDGSLHKGAW 
FLVCSGALVNERTWVAAHCVTDLGKVTMIKTADLKVVLGKFYR 
DDDRDE KTI OSLO I S A 1 1 LH PNYDP I LLDAD I AI L KLLDKAR I S 
TRVQPICLAASRDLSTSFOESHITVAGVTNVLADVRSPGFKNDTb 
RSGWSWDSLLCEEQHEDHG IPVSVTDNMFCASWEPTAPSDI C 
TAETGGI AAVS FPGRASPEPRWHLMGLVSWS YDKTCSHRIiSTAF 








TKVLPFKDW1ERNMK 


6638


13S1 


224 


GGIPOAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNSDIDLSNLERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG 
EK7DPKEKID3GLPPPKVSRTQQLLERKQAIQELRANVEEERAA 
R LRT AS V PLDAVRAE WERTCG PYH KQRLAE Y YGL YRDL FHG AT V 
V P R V PLH VAY AVG E DDLM P VYCGN E V T PTE AAQAP E VT YEAEEG 
S LWT LLLTS LDGHLLE PDAE YLHWL L TN I PGNR VAEG QVTCP YL 
P PFPARGSG J HR L^FLLFKQDQP I DFSEDARPSPCYOLAQRTFR 
TFDFYKKHGETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FVRPPPYHPKOKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046


1268 


1GCF1MDGGDDGNLIIKKRFVSEAELDERRKRRQEEWEKVRKPE 
DPEECPEEVYDPRSLYERLQEOKDRKQOEYEEOFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREEELKELKEYRNNLKKVGISQE 
N KKE VE KK LT VKP I ETKN KFSQAKLLAG AVKH KS S ESGN S VKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVClGIIiPGIi 
GAYRGSSDSESSSDSEGTINATGK1VSSIFRTNTFLEAP 


6640


117 


1043 


VLEP PDVSMAES EDRSLRI VLVGKTGSGKSATANTILGEEI FDS 
RIAAOAVTKNCOKASREWQGRDLLWDTPGLFDTKESLDTTCKE 
ISRCI I S S CPGPHAI VLVLLLGR YTE EEQKT VALI KA V FGKS AM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NS KKTS KAE KES QVQELVE L I EKMVQ CNEGAY FS DD I Y KDTEER 
LK0REEVL.RKI YTDQLNEE I KLVEEDKHKSEEKKEKEI KLLKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641




1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE 
NPQENWAD I QI WDKS PLPLGFS P VCDPMDS KASVS KKKRMCV 
KLLP LG ATDTAVFDVRLSGKTKTVPG YLR I GDMGGFA1 WCKKAK 
APRP VPKPRGLS RDMQGLSLDAASQP SKGGLLERTAS RXGSRAS 
TLRRNDS I YEASSLYGISAMDGVPFTLHPRFEGKSCS PLAFSAF 
GDLTI KSLAD 1 EEEYNYGFWEKTAAARLPPSVS 


6642 


22 


1296 


pleermmtkmdpndqaordi i felrriafdaesdpsnapgsgte 

krkamytkdykmix5ftnhinpamdftqtppgmlal 

hqdtyi r i vlenssredkhecpfgrs ai eltkmlcei lqvgelp 

negrndyhpmffthdrafeelfgi ci qllnktwkemrataedfn 

kvmqwreq i tralpskpnsldqfks klrsls ys e i lrlrqser 

msoddfosppivelrekiqpeilelikqqrlnrlcegssfrkig 

krrroerfwycrlalnhkvlhygdlddnpqgevtfesloekipv 

adikaivtgkdcphmkeksalkqnkevlelafsilydpdetlnf 

iapnkyeyciwidglsallgkdmsseltksdldtllsmemklrl 

ldlen i q i pea p p p i pke p s s yd fv yhyg 


6643 


304$ 


2265 


SLHAPAEGRTRGRLAEKPKMLTRKIKLWDJNAHITCRLCSGYLI 

dattvteclhtfcrsclvkyleenntcptcrivihqshplqyig 

HDRTMODIVYKLVPGIlOEAEf4RKOREFYHKLG^5EVPGDIKGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLECNSSKLRGLKRKWIRCSAOATVLHLKKFIAKKLNLSSFNEL 
D I LCNEE I XjGKDHTLKFVWTRWR FKKAPLLLH YR? KMDLL 


6644 


1489 


290 


FR PLATE P RGS S P VQLVS S TMS VRTL PLLFLNLGG 2MLY I LDQR 
LRAQNI PGDKARKVLNDI I S TMFNRK FME ELFKPQ3L YS KKALR 
TVYER1AHAS I MKIiNQASMDKbYDLMTfOAFKYQVLLCPRPKDVIi 
LVTFNHLDT 1 KGFIRDSPTI LQQVDETLROLTEI YGGLS AGEFQ 
LI RQTLLI FFQDLHI R VS M FLKDKVQNNNGR FVL P VSG P VP WGT 
EVPGLI RMFNNKGESVKRI EFKHGGNYVPAPKEGS FEFYGDRVL 
KLGTNM YSVNQPVETHVSGS SKNLASWTQES I APNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEOAALTRPEELSY 
E V IN I O ATQDQQR S EELAR I MGEFE I TEQPRLS TS KGDDLLAMM 
DEL 


6645 


6530 


4646 


FVEGLAG Y VYKAASEGKVLTLAALLLNRS ESDI R YLLGY VSQOG 
GQRSTPLlIAAIWGHAKVVRLliLEHYRVQTQOTGTVRFDGYVID 
G ATALWCAAGAGH FE WKLL VSHG ANVNH TT VTNS T P LRAACFD 
GR LDI VKY LVENNAN I S I ANKYDNTCLM I AAY KGHTDWRYLLE 
QRAD PNA KAHCGATALH FAA E AG HIDIVKELI KWRAAI WNGHG 
MT P L KVAAE S C KADWE LLLSHADCDR RS R I E ALELLG AS FAND 
REN YDI I KTYH YL YLAMLERFQDGDN I LEXE VLPPIHAYGNRTE 
CRNPOELES I RQDRDALH.M EG1» I VRERI LGADNIDVSHPI I YRG 
AV Y ADNME FEQCI KLWLHALHLRQ KGNRNTH KDLLRFAQV FS QM 
IKLNETVKAPDIECVLRCSVLEI EOSMNRVKNI SDADVHNAMDN 
YECIvTLYTFLYLVCISTKTOCSEEDQCKINKQIYNLIHLDPRTRE 
G FTLLHLAVNSNTPVDDFHTNDVCS FPNALVT KLLLDCGAEVNA 
VDNEGNSALHI IVQYNRPISDFLTLHSII ISLVEAGAHTDMTNK 
QNKTPLDKS TTGVSE I LLKTQMKMSLKCLAARAVRAND IN YQDQ 
IPRTLEEFVGFH 


6646 


176


890 


PSSRMNHLPEDMENAL7GSQSSHASLRN1HSINPTQLMARIESY 
EGREKKGI SDVRRTFCLFVTFDLLFVTLLWI I ELNVNGGIENTL 
EKEVMOYDY YS SYFDI FLbAVFRFKVLl IjAY AVCRLRHWWAIAL 
TTAVTS AFLLAKVI LS KLFSQGAFGYVLP IISFI LAWI ETWFLD 
FKVLPQEAEEENRLLI VQDASERAALI PGGLS DGQFYS PPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890


PS S R MNHLPEDMENALTGSQSSHAS LRNI HS I NPTQLMAR IES Y 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL 
EKE VMQYDYYS S YFDI FLLAVFRFKVLI LAYAVCRLRH WWAI AL 
TTJiVTSAFLLAXVILSKLFSQGAFGYVLFI IS FI LAWI ETWFLD 
FKVLPQEAEEENRLLI VQDASERAALI PGGLS DGQFYS PPESEA 
GSEEAEEKQDSEKPLLEL 


6648 


413 


8S7 


RNCWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDY3EDDNPEL 
I R P0 KLI NP VKTSRNHQDLHRELLMNQ KRGLA PQNK PELQKVME 
KRKRDQVI KQKEEEAQKKKSDLEI ELLKRQQKLEQLELEKQKLQ 
EEOENAPEFVKVKGNLRRTGQEVAOAOES 


6649


1357 


832 


W1FRAAGIRHEVKWDVKEIMSQHN3YVDALLKEFEOFNRRLNEV 
SKRVRI PLPVSNILWEHCI RLANRTI VEGYANVKKCSKEGRALM 
OLD FQQFLMKLEKLTD I R P I PDKEFVETYI KAYYLTENDMERWI 
K EHR E YSTKQLTNLVNVCLGSHINKKAKQKXLAA I DD I DR PKR 


6650 


32


765 


LVPLVFSLLVOSCKQVYRSIAMKFVPCLLLVTLSCLGTLGQAPR 
QKOG S TGEE FHFQTGGRDS CTMR P SS LGOGAG E VWLRVDCRNTD 
QTY W CEYRGQPSMCQAFAADPKSY WNQALQELRRLHHACQGAP V 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VKLTEATOLGKDSMEELGKAKPTTRPTAKPTOPGPRPGGNEEAK 
KKAV? EHCWKPFQALCAFL I SFFRG 


6651


342b 


1353 


AKELLKVGDFSLCAGPYQNTADTMENLSKEPLAS FVSES FPISA 
CGI AT EHVKIDNSGEGLTAEAGSETLS RDGEVGVNSDMHYELSG 
DSDLDLLGDCRNPRLDLEDSYTLRGSYTRKKDVPTDGYESSLNF 
HNNNOEDWGCS S WVPGKETSLPPGHVJTAAVKKEEKCVPP YVQI R 
DLHG I LR'I*YAN FS ITKEL KDTMRTS HGLR R H PS FSANCGLPSS W 
TSTWQVADD LTQNTLDLE Y LR FAH KL KQT I KNGDS QHSASS ANV 
FPKESPTQISIGAFPSTKISEAPFLHPA^KbKbFL»]jVI VVEbDP 
RP0GOPRRGYTASSLDSSSSWRERCSHNRDLRNSORNHTVSFHL 
N KLKYUST VKE S RNDI SL I LNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRS ASYEDI I IDVCTNLHVKLRS WKEA 
CKSTFLFYLVETEDKSFFVRTKNLLRKGGHTEIEPOHFCQAFHR 
ENDTLI I I I RNEDI SSHLHQ I PSLLKLKHFPS V I FAGVDSPGDV 
LDHTYQELFRAGGFVISDDKI LEAVTLVQLKEI I KI LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
YHQCDSRSSTKAE I LKCLLNLQIQHI DARFAVLLTDKPTI PREV 








FENNGILVTDVNKFIENIEKIAAPFRSSYW 


6652 


2 


1342 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPbFLPTRPAERAWI RSRRkS E1WGKMEVPRLDHALNS PTSPC 
EEVXKNLSLEAIQLCDRDGNKSODSGIAEMEELPVPHNIKISNI 
TCDSFKISWEMDSKSKDRITKYFIDLNKKENKNSKKFKHKDVPT 
KLVAKAVPLPMTVRGHWFLSPR7EYTVAV0TASK0VDGDYWSE 
WSEIIEFCTADYSKVHLTOLLEKAEVIAGRMLKFSVFYRWQHKE 
YFDYVREHHGNAKQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPQDSPYGRYRFEIAJ^EKLFNPNTNLYFGDFYCMYTAYHYV 
I LVI APVGS PGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 
QDV I LEVI YTDPVDLSLGTVAE 1 TGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1S10 


FFLEPRLRPFPASRARFVPARTR PS PLKPCCFCFEGGGSMbSPQ 
R VAAAASRGADDAMESS KPG PV0WLVOKDQHS FELDEKALAS J 
LLQDHI RDLDWWS VAGAFR KG KS FI LDFMbRYLYSQKESGHS 
KWLGDPEEPLTGFSWRGGSDPET?GIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSOSTVKDCATIFALSTKTSSVQIYNLSQNIQED 
DLQOLQLFTEYGRLAMDE i FC'KP FQTLMFLVRDWS FPYEYS YGL 
OGGMAFLDKRLQV KEHQHBE 1 0NVRNH1HSCFSDVTCFLLPHPG 
LQ V ATS PDFDG KLKD I AG E FKEQLQAXi I P YVLN PS KLME KEING 
S KVTCRGLLE YFKA Y 1 KI YQGEDLPHPKSMLQATAEAYNLAAAA 
SAKDI YYNNMEEVCGGEKPYLS PDI LEEKHCEFKQLALDHFKKT 
KKMGGKDFSFRYOQELEEEIKELYENFCXHNGSKNVFSTFRTPA 
VLFTGIVALY1ASGLTGFIGLEWAQLFNCMVGLLLIALLTWGY 
IRYSGQYRELGGAIDFGAAYVLEQASSHIGNSTQATVRDAVVGR 
PSMDKKAQ 


6654 


1 


705- 


RTSLSPSQCSSFNlJ^SAGM01bGV\n J TLLGV}VNGliVSCAX l PM 
WKVTAFIGNSIVVAQWWEGLWMSCVVOSTGOMQCKVYDSLIiAL 
PODLOAARAbCVIALLVALFGLLVYLAGAKCTTCVEEKDSKARL 
VLTSGIVFVISGVLTLIPVCWTAHAVIRDFYNPLVAEAOKRSLG 
ASLYbGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 




KDAYKFKKGLLALAIjVFSLPVFAAEHWIDVRVPEQYQQEHVQGA 
INI PLKEVKERIATAVPDKNDTVKVYCNAGRQSGQAKEIbS EMG 
YTHVENAGGbKDIAMPKVKG 


6656 


2 


1212 


TELPPRPANbAIQPPLSPLRALAPLPEKPGAVP.PPQKRMAKVAK 
DLN PG VX KMSLGQLQS ARG VA CLGCKGTCSG FEPHS WRK I CKS C 
KCS QEDH CLTSDLEDDR KI GR ULKDS KYSTLTAR VKGGDG I RI Y 
KRNRM I MTNP 1ATGKDPTFDT3 TYEV7APPGVTQKLGLQYMELIP 
KEKOPVTGTEGAFYRRRObMHObPIYDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGEVALPGOGGbPKEEGKQOEKPEGAETTAA 
TTNGS LS DPS KE VE YVCEL CKG AAP P DS P WYSDRAG YN KQWHP 
TCFVCAKCSEPLVDblYFWKDGAPWCGRHYCESLRPRCSGCDEI 
IFAEDYQRVEDLAWHRKHFVCEGCEQLI^GRAYIVTKGQbLCPT 
CSKSKRS 


6657 


830 


212C 


LLTCQERAGDCLLSASTMKEW1.'WSPKKVADWLLENAMPEYCEP 
LEHFTGQDL I NLTOEDF KK P P LCRVS S DNGQRLLDMI ET LKMEH 

HT ,F AT-? KNGHANfSU T .KI T f5VT) T PTP TV? *? FQ T V T Y PN<^M PNG YR KFM 

I KI PMPELERSQYPMEWGKTFLAFLYALSCFVbTTVMI SWHER 
VPPKEVQPPLPDTFFDHFNR VOWAFS I CEINGMILVGLWLIQWL 
bbK^KSIISPJRFFCIVGTbYbYKCITMYVTTLPVPGMHFNCSPK 
LFGDWEAQLRRIMKLI AGGGbS I TGS HNMCGDYLYSGHTVMLTL 
TYLFI KEYS PRRLWWYHWI CWLbS WGI FCILLAHDHYTVDWV 
A Y Y I TTRLFWW YHTMANQQ VbKS ASQMN LLAR VW W Y R P FQYFEK 
NVQGI VPRSYHWPFPWPWHLS RQVKYSRLVNDT 


6658 


36 


85^ 


KCCAU?APGSPYRGLYFSSAAPCTAPRKAKHQSTLEGbTKRMLM 








FDPVPVKOEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QTPEGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPP I KKYSPPS PGVQPFGVPLSMPPVMAAALSRHG 
IRSPG1LPVI0PWV0PVPFMYTSHLQCPLMVSLSEEMENSSSS 
MQVPVIESYEKPISQKKIKIEPGIEPORTDYYPEEMSPPI>MNSV 
SPPQALLOE 


6659 


16 


523 


EP0RGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNL3SLPGL 
RGLTS I SRNQLCCTNAMRV 1 YQRRW Kl JQNTFLLAT FANWNV 
CGNPTI TCPHMRT LNN CHK £ G VQ V PLM Y CN LTT PS PQNISNCRY 
A0TPANMFYIVACDNRD0RRDPPQYPWPVHLHTI3 


6660 


514 


1707 

1 


CAASLDCRHHl^EPDMKLVWPSAKLLQATiAGASARACDSVTSNV 
LPLLLE0FHKHSOSSORRT3LEMLLGFLKLOOKWSYEDKDQRPL 
NGFKDQLCSLVFKALTDPSTQLQLVGIRTLTVLjGAQPDLLSYED 

lelavgklyrlsflkedsoscrvaaleasgtlaaiiypvafsshl 
vpklaeelrvgesnltngdeptocsrhlccloalsavsthpsiv 
ketlplllchlwovnrgnmvaossdv i av coslrqmaekcqodp 
escwyfhotaipcllalavoasmpekepsvlrkvlledevlaam 
vsvigtatthlsfelaaosvthivpl,fldgnvsflpensfpsrf 

QPFQDGSSGQRRLIALLMAFVCSLPRJJVSEHIWEVLLFNLDKVT 
PG 


6661 


179 


430


GVHAASGTLS ATWJLAEAKMFDS LAKAGKYLGQAAKLM I GMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERODAKYGGKGGARCC 


6662 


165 


4 2:- 


R S LP KPA PAQPAS 3 H CAR FSG VT P PTA KTAMS DGNTA FNALM Y C 
GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 


3 


100 f 


RPVTLSSRVDDFVPPLPETSGRRKJtLERMYSVDRVSDDIPIRTWF 
PKENLFSFQTASTTMOAI SNFR KHLRMVGSRR VKAOTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVML.1SSKVPKAEY3 PTI IRRDDPSI I 
PILYDHEHATFEDILEEI ERKLNVYHKGAKI WKMLI FCQGGPGK 
LYLLKNKVATFAK VEKEEDMI KFWKRLS K LMS KVNPEPNVIHI M 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAOGKQMIET 
YFDFRLYRLWKSRQKS KLLDFDDVL 


6664 


58 


96* 


PRLLRLPRSVWMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNP I SS WFTAMLHCFGGGI LS CLLLAE P PLKF 
LANKTNI LLAS S I WY 3 TFFCPHDLVSQGYSYLPVQLLASGMKE V 
TRTWK I VGGVTHANSY YKNGW I VMI AI GWARGAGGT1 ITNFERL 
VKGDWKPEGDEWLKKSYPAKVTLLGSVIFTFQHTQHLAISKHNL 
MFLYTI FIVATKI TMMTTQTS TMTFAPFE DTLS WMLFG WQQ PFS 
SCEKKSEAKSPSNGVGSLASKPVDVASDNVKKKHTKKNE 


6665


171 


127fc 


BERRLACRQWTQQRSEL.Y PGFQKRQRFLF KAGEEAAAQGGRKL 
PGRWLGPGCTONPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALCPG I PSPCRMTHQDLS I TAKLINGGVAGLVGVTCVFPI DLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKA1KIAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVVVTC 
PMEKLKIOLQDAGRLAVHHOGSASAPSTSRSYTTGSASTHRRPS 
ATLI AWELLRTOGLAGLYRGLGATLLRDI PFS 1 1 YFPLFANLNN 
LGFNELAGKASFAHSFVSGCVAGSIAAVAVTPLDVLKTRIQTLK 


6666 


498


2866 


MTTFLPVPOMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WSPGCWPQPIQKEGVGLWD I R KPOSSLLRYGGNLSLOSAMSVRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFOFDNOVYFNSCTMKSC 
CFAGDRDQ Y I LSGSDDFNiiYMWR I PADPEAGG I GR WNGAFMVL 
KGHRSIVNQVRFNPHTYMICSSGVEKIIKIWSPYKOPGCTGDLD 
GRI E DDS RCLYTH EE Y I S LVLNSGS G LSH DY ANQS VQED PRMMA 
FFDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPRSFPPTVDESADKAFHLGPLRVTTTNTVASTPPTPTCED 








PRSPSPEDESSSSSSSSSSEDEEELNERRASTWQRNAMRRRQKT 
TREDKPSAPI KPTNTY IGEDKYDYPQIKVDDLSSSPTSS PERST 
STLEIOPSRASPTSDIESVERKiyJCAyKWLRYSYISYSNNKDGE 
TSbVTGEADEGRAGTSHKDNPAPSS SKEACLN I AMAQRNODLP F 
EGCSKDTFKEETPRTPSW6PGKEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNMGONLGELEW 
AVSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACSTPNAGTRKDP 


6667 


373 


1310 


AEEVERLAAMRSDSLVPGTHTPP1RRRSKFANLGRIFKPWKWRK 
KKSEKFKHTSAALERKISMR0SREELIKRGVLKEIYDKDGSLS3 
SNLEIJSIjENGQSLSSSQLSIjPALSEMEP VPMPRDPCbYEV]jOPS 
Dl MDGPDPGAPVKLPCLPVKLSPPbPPKKVMI CMPVGGPDLSJjV 
S YTAQKSGQQGVAOHHHTVLPSQI QHQUQYGSHGQHIiPS TTGSL 
FMHPSGCRM3DELNKTLAMT«CRLESSEQRVPCSTSYHSSGLHS 
GDGVTKAGPMGLPEIRQVPTW1ECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNI LPRQTDEERLELRQQIGTKL 


6668 


714 


3S8 


TLAVATGPALT LRCHVCTS S SNC KHS WCFAS SR F CKTTNTV EP 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCOEDLCNEXbH 
NAAPTRTALAHSALSLGLALSLlAVIliAPSL 


6669 



45S 


1207 


KDEETRKDYDYMLDHPEEYYSHYYKYYSRRLAPKVDVRWI L>VS 
VCAI S V FQFFS WWN S YNKA 1 SY LAT V P KY RI QAT E I AKQQGLLK ! 
KAKEKGKNKKSKEEIRDEEENIIKNI3KSKID3KGGYOKP0ICD ! 
LLLFQ3 1 LAP FHLCS Y I VWYCRWI YNFNI KGKEYGSEERLYI I R 
KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKOEQEEELK 
KKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 


384 


594 


VARI*GEA i U<MSSEPPPPyPGGPTAPLl;EEKSGAPPTPGRSSPA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM 
PPGFYP PPGPHPPMG Y YPPG PYTPG PYPGPGGHTATVbV P SGAA 
TTVTV 


6671 


1 


763 


LPAEXPRSAPNMAGGRCGPQLTALLAAWIAWAATAGPEEAALP 
PEQSRVQPMTASNWTliVMEGEWMLKFYAPWCPSCQOTDSEWEAF 

ti VKir'i? t t j*vt c?i7r , v\rrMrTr\trTV , T cr , DT?irvp'PTvD2i.i?PWa.vrvzT ppb 
AWMoti J^ibV\jjvViJvxy^lrv»iji>V5Krr v 1 1 t>r\ftr r J1>uvia?i r xtK 

YRGPG I FEDLQNY3 LEKKWQS VEPbTGWXS PASLTMSGMAGLFS 

I SGKIWHLHNYFTVTLGIPAWCS YVFFVIATLVFGLSMDLVL* V 

ISOCNWDPPYRHVS* / RPSTNLG VHTAHTS EHLRL 


6672 


304 


1089 


APGSKPV0FMDFEGKTSFGMSVFNLSNA1I«3SGILGLAYAKAHT 
GV 1 FFLAtiLLCIALLS S YSIHLLLTCAGIAGI RAYEQLGQRAFG 
PAGKWVATVICLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DW FLKGNLLI 1 1 VS VLI I LPLALMKHLGYIXSYTSGLSLTCKLFF 
LVSVIYKKFQLGLCYRATMKQOWESEALVGTPOPRDSTAAVKAO 
MFH5 * I»TG VLTQ WP I MA FAFVCKPGGAG PS I TELCRA FQAQD 


6673 


1116 


1963 


LQ I QTHHTHHGARVTHLGSHQLLANAGTMLCRQQSS SMAPAFSQ 
SVT CGP S PCVR KQES ATKCLH 1 GACGSDLWARGWEQG* G* GLNV 
WI.CPCVAFERGARP0AEEGGARV7NSLVSSPWI PPNP* HSS 1 GAE 
NAVPRP*QG+ KVNPSGQERQS \V7VLPLPVPGEPLKLPGLPG*NK 
S FSRV/SGS KGKWI LPRQLM* AS* R\TPRFVPGTQWVPI TW/ PL 
ITWH*SAPTPPLKACPAPRESDPCSSCLSCPCVTOKPRFSDTGW 
FGAGHCHS S CDFTR KGAAGGPG 


6674 


1 


440 


LEFDYMCQY DYVEVRDGDNPJDGQI IKRVCGNERPAP IQSIGSSL 
HVLFHSDGSKiaFDGFHAIYEEITACSSSPCFHDGTCVLDKAGSY 
KCACLAGYTGORCENLLEERNCSDPG/WPSQWVPENNRGPWAYO 
PTPC* IGTRVAFFLT 